Ansible Playbook Performance Profiling and Optimization #409

  ### Background
  During some OOO digging, I discovered that Ansible's built-in profiling callbacks provide valuable insight into where playbook time is spent and which tasks are the bottlenecks.

  ### Current Findings

  Using `ANSIBLE_CALLBACKS_ENABLED="profile_tasks,profile_roles"` on a single bootnode reveals both role-level and task-level time breakdowns. For example:

  **Role Execution Times:**
  - lighthouse: 14.00s

  **Task-Level Breakdown:**
  - `ethpandaops.general.lighthouse : Run lighthouse container` - 10.71s
  - `ethpandaops.general.ethereum_node_fact_discovery : Get consensus node identity` - 9.84s
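
  To compare runs, the timing lines above can be turned into structured data. The following is a minimal sketch, not an existing tool: `parse_profile_lines` is a hypothetical helper, and it assumes each timing line has the shape `<task name> - 10.71s` (as pasted above) or `<task name> ------ 10.71s` (the dash-padded summary that `profile_tasks` prints).

  ```python
  import re

  # Assumed line shape: "<task name> <one or more dashes> <seconds>s".
  LINE_RE = re.compile(r"^(?P<name>.+?)\s+-+\s+(?P<secs>\d+(?:\.\d+)?)s$")


  def parse_profile_lines(text):
      """Return (task_name, seconds) pairs, slowest first."""
      rows = []
      for raw in text.splitlines():
          # Drop list bullets and backticks so pasted issue text also parses.
          line = raw.strip().lstrip("-* ").replace("`", "").strip()
          match = LINE_RE.match(line)
          if match:
              rows.append((match.group("name").strip(), float(match.group("secs"))))
      return sorted(rows, key=lambda row: row[1], reverse=True)
  ```

  Lines that do not match the assumed shape (headers, blank lines, host recaps) are skipped rather than raising an error, so the whole playbook log can be passed in.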


  ### Proposal

  1. **Basic Profiling Campaign**
     - Run profiling against 50+ nodes to establish a statistical baseline (likely on one of the latest devnets, during its initial setup)
     - Identify which tasks are consistently slow and which merely vary from run to run
     - Distinguish between tasks we cannot speed up (e.g., docker run commands) and ones we can optimize

  2. **Enhanced Profiling with ansible-runner**
     ```bash
     ANSIBLE_CONFIG=./ansible.cfg ansible-runner run . -p playbook.yaml \
       --cmdline "--tags ethereum --limit bootnode" --artifact-dir ./artifacts
     ```
     This generates detailed artifacts with start/stop times and durations per task. We need a performance visualization tool that reads the generated files and highlights the worst-case tasks and roles.
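
As a starting point for such a tool, here is a rough sketch that ranks tasks by total duration. It rests on assumptions that should be checked against the ansible-runner version in use: that events are written as one JSON file each under `<artifact-dir>/<run id>/job_events/`, and that `runner_on_ok` / `runner_on_failed` events carry `event_data.task`, `event_data.role`, and `event_data.duration`. `slowest_tasks` is a hypothetical name.

```python
import json
from collections import defaultdict
from pathlib import Path


def slowest_tasks(artifact_dir, top=10):
    """Sum per-host task durations and return the `top` slowest (role, task) pairs."""
    totals = defaultdict(float)
    for path in Path(artifact_dir).glob("**/job_events/*.json"):
        try:
            event = json.loads(path.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or partially written event files
        # Assumption: only per-host result events carry a duration.
        if event.get("event") not in ("runner_on_ok", "runner_on_failed"):
            continue
        data = event.get("event_data") or {}
        duration = data.get("duration")
        if duration is None:
            continue
        key = (data.get("role") or "", data.get("task") or "")
        totals[key] += float(duration)
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]
```

Calling `slowest_tasks("./artifacts")` after the command above would return a list of `((role, task), total_seconds)` tuples, which could feed a table or chart.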

One or both approaches should give us enough information to decide where to spend time optimising our Ansible stack for 1k+ devnets.
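
Once timings from many nodes are collected, the "consistently slow vs. variable" split from the proposal could be approximated with the mean and the coefficient of variation per task. This is only a sketch: `classify_tasks` is a hypothetical helper, and the thresholds (`slow_s`, `cv_limit`) are placeholder values that would need tuning against real data.

```python
from statistics import mean, pstdev


def classify_tasks(samples, slow_s=5.0, cv_limit=0.25):
    """Label each task from its per-node durations (seconds).

    samples maps task name -> list of durations, one per node.
    slow_s and cv_limit are illustrative thresholds, not measured values.
    """
    report = {}
    for task, durations in samples.items():
        if not durations:
            continue  # nothing measured for this task
        avg = mean(durations)
        # Coefficient of variation: spread relative to the mean.
        cv = pstdev(durations) / avg if avg > 0 else 0.0
        if avg < slow_s:
            label = "fast"
        elif cv <= cv_limit:
            label = "consistently slow"
        else:
            label = "variable"
        report[task] = {"mean": avg, "cv": cv, "label": label}
    return report
```

Tasks labelled "consistently slow" are the natural first candidates for deciding whether they are fixable, while "variable" ones may point at network or host-specific issues.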

