bug: Tasks with thousands of related nodes get stuck in RUNNING when Prefect rejects events for exceeding PREFECT_SERVER_EVENTS_MAXIMUM_RELATED_RESOURCES (default 500) #9068

### Component

API Server / GraphQL

### Infrahub version

1.9 (develop)

### Current Behavior

Prefect events carry a list of related resources (one entry per related node passed to a flow / emitted from a flow run). Prefect's server enforces an upper bound on this list through `PREFECT_SERVER_EVENTS_MAXIMUM_RELATED_RESOURCES`, which defaults to **500**.

When an Infrahub flow runs over thousands of related nodes (e.g. a computed-attribute pipeline touching every instance of a kind), the terminal event emitted for the flow run exceeds this cap. The Prefect server raises a validation error while persisting the event, the flow run never receives a terminal state transition, and the task is stuck in `RUNNING` indefinitely from Infrahub's perspective.

The failure is caused solely by the size of the related-resources list on the emitted event; the underlying work has already finished successfully.

### Expected Behavior

Tasks with arbitrarily large related-node sets must reach a terminal state (`COMPLETED` / `FAILED`) reliably. Options to consider:

1. **Cap or chunk related resources at emit time.** Truncate the related-resources list to a safe size before the event is sent, or split into multiple events.
2. **Avoid attaching every related node to flow-run events.** Move the per-node identity off events (e.g. into task logs or a dedicated relationship in the Infrahub graph) and only keep aggregate / summary related resources on the Prefect event.
3. **Raise the Prefect limit.** Document and ship a higher `PREFECT_SERVER_EVENTS_MAXIMUM_RELATED_RESOURCES` value with the Prefect server config. On its own this only delays the failure, so it should be paired with (1) or (2).
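
A minimal sketch of option (1), for discussion only. It is not taken from the Infrahub codebase: `cap_related`, `chunk_related`, and `MAX_RELATED_RESOURCES` are hypothetical names, the resource IDs and role in the usage example are made up, and the limit is hard-coded to the default of 500 rather than read from the Prefect settings. It only shows the list handling that would happen before an event is emitted.

```python
from typing import Iterator

# Assumption: mirrors the default of PREFECT_SERVER_EVENTS_MAXIMUM_RELATED_RESOURCES.
MAX_RELATED_RESOURCES = 500

# A related resource is treated here as a flat mapping of label -> value.
RelatedResource = dict[str, str]


def cap_related(
    related: list[RelatedResource], limit: int = MAX_RELATED_RESOURCES
) -> list[RelatedResource]:
    """Truncate the list to at most `limit` entries, keeping the first ones."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    return related[:limit]


def chunk_related(
    related: list[RelatedResource], limit: int = MAX_RELATED_RESOURCES
) -> Iterator[list[RelatedResource]]:
    """Yield consecutive slices of at most `limit` entries, one per event."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(related), limit):
        yield related[start : start + limit]


# Usage: 1200 illustrative related nodes (IDs and role are invented).
nodes = [
    {
        "prefect.resource.id": f"infrahub.node.{i}",
        "prefect.resource.role": "infrahub.related.node",
    }
    for i in range(1200)
]
capped = cap_related(nodes)
chunk_sizes = [len(chunk) for chunk in chunk_related(nodes)]
```

Truncation keeps a single terminal event but drops information about the remaining nodes; chunking keeps every node but requires deciding which of the resulting events carries the terminal state.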

### Steps to Reproduce

1. Load a schema with a kind and create more than 500 instances of it.
2. Trigger a workflow that operates over all instances and reports them as related nodes on the resulting task (e.g. a Python transform computed attribute, see #9025).
3. Observe in the UI / `infrahubctl task list`: the task remains in `RUNNING`.
4. Inspect the Prefect server logs: a validation error rejects the event because its related resources exceed `PREFECT_SERVER_EVENTS_MAXIMUM_RELATED_RESOURCES` (500). No terminal state is recorded for the flow run.

### Additional Information

- Blocking PR opsmill/infrahub#9025 (transform-based computed attribute performance improvements). The PR makes large-batch runs realistic and consistently exposes this failure mode.
- Default Prefect setting: [`PREFECT_SERVER_EVENTS_MAXIMUM_RELATED_RESOURCES = 500`](https://docs.prefect.io/v3/api-ref/settings-ref#related-resources).
- Companion frontend issue: tasks with thousands of related nodes also crash the browser due to per-node display-label queries (filed separately).
