Commit 807fb80

Feature:4135 Add docs section for step heartbeat
1 parent e872336 commit 807fb80

1 file changed: +34 -0 lines changed

docs/book/how-to/steps-pipelines/advanced_features.md

Lines changed: 34 additions & 0 deletions
@@ -147,6 +147,40 @@ If steps 2, 3, and 4 execute in parallel and step 2 fails:
All three execution modes are currently only supported by the `local`, `local_docker`, and `kubernetes` orchestrator flavors. For any other orchestrator flavor, the default (and only available) behavior is `CONTINUE_ON_FAILURE`. If you would like to see any of the other orchestrators extended to support the other execution modes, reach out to us in [Slack](https://zenml.io/slack-invite).
{% endhint %}

### Step Heartbeat

Step heartbeat is a background mechanism that runs alongside each step execution and performs two core functions:

- Periodically pings the ZenML server to refresh the step's heartbeat value.
- Retrieves the current pipeline and step status, and terminates the step if the pipeline has entered a stopping state.

This enables ZenML to:

- Track the liveness of a step execution and assess its health based on incoming heartbeats.
- Gracefully interrupt running steps when a pipeline is being stopped.

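The two functions above can be pictured as a small background loop. The following is only an illustrative sketch, not ZenML's actual implementation: `HeartbeatWorker`, `send_heartbeat`, `on_stop`, the `"stopping"` status string, and the 30-second interval are all hypothetical names and values chosen for the example.

```python
import threading
from typing import Callable


class HeartbeatWorker:
    """Hypothetical background worker; not ZenML's real heartbeat class."""

    def __init__(
        self,
        send_heartbeat: Callable[[], str],
        on_stop: Callable[[], None],
        interval_seconds: float = 30.0,  # assumed interval, for illustration
    ) -> None:
        # send_heartbeat is assumed to ping the server and return the
        # current pipeline status as a string.
        self._send_heartbeat = send_heartbeat
        self._on_stop = on_stop
        self._interval = interval_seconds
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        # Wake the loop immediately and wait for it to exit.
        self._stop_event.set()
        if self._thread.is_alive():
            self._thread.join()

    def _run(self) -> None:
        while not self._stop_event.is_set():
            status = self._send_heartbeat()
            if status == "stopping":
                # The pipeline is being stopped: interrupt the step.
                self._on_stop()
                return
            # Sleep until the next ping, or until stop() is called.
            self._stop_event.wait(self._interval)
```

Using a `threading.Event` for the wait, rather than `time.sleep`, lets the worker shut down promptly when the step finishes on its own.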
*Scope and current behavior*

- Heartbeats are enabled only for steps executed in isolated environments. This excludes:
  - `Inline` steps in `dynamic` pipelines.
  - Steps run via the `local` orchestrator.
- A step that becomes unhealthy automatically triggers a graceful shutdown (currently supported for the `kubernetes` orchestrator).
- When using the `CONTINUE_ON_FAILURE` execution mode, heartbeat status is also used to decide whether execution tokens should be invalidated.

*Configuration*

You can configure how long a step may go without sending a heartbeat before it is considered unhealthy using the `heartbeat_healthy_threshold` step parameter.
The current default is the system's maximum allowed value (30 minutes).

```python
from zenml import step


# Mark the step unhealthy once it goes this long without a heartbeat.
@step(heartbeat_healthy_threshold=30)
def my_step():
    ...
```

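To make the threshold semantics concrete, here is a minimal sketch of how a health check based on the last received heartbeat might look. It is not taken from the ZenML codebase: `is_step_healthy` and `MAX_THRESHOLD_MINUTES` are hypothetical names, and treating the threshold as a number of minutes is an assumption based on the example value `30` matching the stated 30-minute maximum.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the threshold is expressed in minutes, capped at 30.
MAX_THRESHOLD_MINUTES = 30


def is_step_healthy(
    last_heartbeat: datetime,
    now: datetime,
    threshold_minutes: int = MAX_THRESHOLD_MINUTES,
) -> bool:
    """Hypothetical check: healthy if the last heartbeat is recent enough."""
    if not 0 < threshold_minutes <= MAX_THRESHOLD_MINUTES:
        raise ValueError(
            f"threshold must be between 1 and {MAX_THRESHOLD_MINUTES} minutes"
        )
    return now - last_heartbeat <= timedelta(minutes=threshold_minutes)


now = datetime.now(timezone.utc)
# A step silent for 45 minutes exceeds the 30-minute default.
print(is_step_healthy(now - timedelta(minutes=45), now))  # False
```

A lower threshold detects stalled steps sooner, at the cost of flagging steps that are merely slow to report.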
## Data & Output Management

## Type annotations
