perf(zero_trust_list): optimize large list performance with hash-base…#6901
Open
rexscaria wants to merge 12 commits intocloudflare:mainfrom
Open
perf(zero_trust_list): optimize large list performance with hash-base…#6901rexscaria wants to merge 12 commits intocloudflare:mainfrom
rexscaria wants to merge 12 commits intocloudflare:mainfrom
Conversation
…d comparison Lists with 2000+ items were taking 70+ seconds to plan due to Terraform's expensive set comparison for nested objects. This optimization: 1. In Read(): Preserve state items instead of replacing with API items 2. In ModifyPlan(): Compare items using fast SHA256 hash (~0.5ms for 5000 items) instead of Terraform's set diff (~70s for 2000 items) Performance improvement: ~70,000x faster for no-change plans on large lists. Trade-off: External drift detection on individual items is reduced, but Terraform config remains the source of truth.
c3e165a to
568b570
Compare
added 11 commits
March 24, 2026 10:45
- Fix Read() to fetch items via GET /gateway/lists/{id}/items so external
drift is detectable; removes the existingItems state-preservation workaround
- Fix items-fetch error handling to only remove resource from state on 404;
transient errors (timeouts, 5xx, 429) now surface as proper diagnostics
- Fix Description field to use types.StringNull() when absent in API response,
preventing perpetual diffs on ImportStateVerify
- Fix ModifyPlan to guard against null state (first create) in addition to
null plan (destroy)
- Fix Delete to remove spurious resp.State.Set() call (no-op in framework)
- Fix all io.ReadAll errors silently discarded with _
- Fix ListExact/ListSizeExact -> SetExact/SetSizeExact in tests (items is a
SetNestedAttribute; integer index paths don't apply)
- Fix TestAccCloudflareTeamsList_Update to use ConfigStateChecks + SetExact
instead of broken items.0.value index-based checks
- Replace null-byte separator in computeItemsHash with length-prefixed
encoding to eliminate field/entry boundary collision classes
- Add modify_plan_test.go with unit tests for all ModifyPlan paths
- Add buildPlan/buildNullPlan helpers; remove reuse of tfsdk.State for plans
- Add teamslistconfig2000items.tf dedicated test template
- Strengthen SameItems/ReorderedItems test assertions to verify plan raw
equals state raw after suppression, not just non-nil
…elpers Reduces repeated boilerplate across Create/Update/Read/ImportState without changing any behaviour.
Move all hand-written logic (fetchItems, suppressItemsDiff, computeItemsHash, helpers) into items.go so it is clearly separated from the generated resource.go. resource.go now has two minimal touch points: - Read() calls fetchItems() — one line - ModifyPlan() delegates to suppressItemsDiff() — one line This makes the generated file easy to re-apply on future regenerations without losing any of our custom behaviour.
An empty list (0 items) and a deleted list (404) both previously returned (nil, nil) from fetchItems, causing Read() to call RemoveResource on any list that had no items configured. Fix: fetchItems now always returns a non-nil slice on success (empty slice = list exists with 0 items, nil = 404/deleted). Also: - Move file-level godoc comment above package declaration - Correct O(n²) complexity claim to O(n × cost-per-element) - readBody now defers Body.Close() — intentional improvement over generator
cloudflare#1 computeItemsHash: null and empty string now hash differently. Previously types.String.ValueString() collapsed both to "", silently suppressing a diff when a user changed description from null to "". Now encoded with explicit type tags: null -> "null", string -> "s:<value>". cloudflare#2 computeItemsHash: nil item now encodes as "nil" sentinel instead of "" (zero value), preventing collision with a real item whose value and description are both empty. cloudflare#3 gatewayItemDescription: also check IsMissing() so absent JSON fields are stored as StringNull() rather than StringValue(""), avoiding perpetual plan diffs for configs that omit description. cloudflare#4 readBody moved back to resource.go where it is used. Removed dead code and unjustified io/net/http imports from items.go. cloudflare#5 suppressItemsDiff: replace resp.Plan.Set(ctx, plan) with resp.Plan.SetAttribute(ctx, path.Root("items"), ...) to surgically update only the items attribute without clobbering unknown computed values in other plan fields. Update hash_test.go to reflect the corrected semantics: null != empty string, nil item != empty-value item.
…nation assertion cloudflare#1 suppressItemsDiff: bail out before hashing if any plan item contains an unknown Value or Description. Previously, unknown fields collapsed to "" or "null" via ValueString(), which could match state and suppress a real diff when items reference computed outputs from other resources. cloudflare#2 gatewayItemDescription: remove dead IsMissing() check. IsNull() already covers missing fields (status <= null), so the extra check was unreachable and could mislead future readers about field semantics. cloudflare#3 fetchItems: assert len(page.Result) <= 1 and return an error if violated, rather than silently dropping page.Result[1:] and corrupting state.
…nostics cloudflare#1 computeItemsHash: null/unknown/empty now all have distinct type tags. Previously unknown collapsed to empty string (value) or "null" (desc), enabling false-positive diff suppression when items reference computed outputs. Encoding is now v:null/v:unknown/v(n):<val> and d:null/d:unknown/d(n):<val>. cloudflare#2 suppressItemsDiff: use local diag.Diagnostics for Plan.Get/State.Get so deserialization failures in this best-effort hook do not surface as provider errors in the plan response. cloudflare#3 suppressItemsDiff: add !IsKnown() guard so the function exits early when the plan or state is in an unknown state (resource ID not yet computed). cloudflare#4 fetchItems: check page.JSON.Result.IsNull() and return an error when the API returns a null result field rather than treating it as 0 items. Update hash_test.go: fix concatenation collision test to use new encoding format; add unknown vs null tests for value and description.
…gination guard - A null page.JSON.Result now correctly returns an empty items slice (list exists with 0 items) instead of an error. The Cloudflare API may return null for result on an empty list, which is valid. - Removed the len(page.Result) > 1 guard. SinglePage[[]GatewayItem].Result is [][]GatewayItem where the outer slice reflects the API response structure, not pagination pages. The guard was semantically wrong and would false-error on valid responses. The inner loop now directly iterates page.Result[0] guarded by the null + length check.
…-element guard cloudflare#1 fetchItems: iterate all outer Result batches instead of only Result[0]. Silent data loss if API ever returns more than one outer element. cloudflare#2 suppressItemsDiff: treat nil Items pointer as equivalent to empty slice so configs with items=[] and state with nil items are compared correctly. cloudflare#3 suppressItemsDiff: filter nil elements from stateItems before SetAttribute to avoid potential panic in the framework's slice reflection. cloudflare#4 fetchItems: remove redundant page.JSON.Result.IsNull() check — it is implied by len(page.Result) > 0 since a null field decodes to nil/empty.
…skipping in hash cloudflare#1 suppressItemsDiff: early-return when both plan.Items and state.Items are nil (items not configured at all), preventing the function from writing an empty items=[] into the plan and causing perpetual diffs for configs that omit the items block entirely. cloudflare#2 computeItemsHash: skip nil elements instead of encoding them as 'nil', matching the nil-stripping behavior in suppressItemsDiff. Previously the hash included nil tokens but the written plan value did not, creating a latent inconsistency in nil semantics between the two functions.
…Description apijson.Field has four statuses: missing(0), null(1), invalid(2), valid(3). IsNull() only covers status <= null, missing the invalid case where the field is present in the JSON response but failed to decode. When status == invalid, item.Description is the Go zero value "". Without this fix, an invalid field is stored in state as "" instead of null, causing a perpetual spurious diff that hash suppression cannot resolve.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
ZT Lists with 2000+ items were taking 70+ seconds to plan due to Terraform's expensive set comparison for nested objects. This optimization:
We have to use set for lists to make sure we don;t have duplicates. But set comparison is painfully slower than list comparison in terraform. The comparison is happening in the set internals library, same place as the performance bottleneck is observed. We do not want to make them list and introduce bugs. Instead we are suggesting a new modifyPlan with hash based comparison of the items.
Performance improvement: ~70,000x faster for no-change plans on large lists.
Trade-off: External drift detection on individual items is reduced, but Terraform config remains the source of truth.
Tests added. I can add an acceptance test ASAP