profiling: run cylc 7 profile tests against cylc 8 #141

@oliver-sanders

Cylc 7 had a built-in automated profile battery which gave us insight into Cylc's scaling characteristics.

This profile battery will need a total rewrite for Cylc 8 (there is an issue for that: #38). However, that is a relatively large job which we don't have time for at the moment, so I suggest manually running some of the old scaling tests to give us an idea of how scaling has changed and how Cylc 8 compares against Cylc 7.

Quick overview of the main scaling dimensions of Cylc:

  • Tasks
  • Edges
  • Recurrences (e.g. T00, T01, T02, etc, raises CPU load)
  • Churn / throughput (number of task events per second)

These dimensions are targeted by suites which were located in etc/dev-suites in the Cylc 7 source code.

For MO people some (rather dated) results from the old automated profile-battery can be found here: https://www-nwp/~osanders/profiling-results/2016/

To test scaling you will need to run the same experiment with different template variables.

Notes:

  • I suggest ramping up the scaling factors slowly; you can easily take down your machine with profiling experiments.
  • The chicken switch is killall python!
  • If you keep adding parallel tasks (without an internal Cylc queue), you will eventually hit your machine's fork limit, which will cause task failures (and a very unhappy box).
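
As a rough guard against the fork limit mentioned above, here is a minimal sketch that compares a planned number of parallel tasks with the per-user process limit. It assumes the "fork limit" corresponds to `RLIMIT_NPROC` on your system; the function name `process_headroom` and the `safety_factor` default are hypothetical, and a single task job may spawn more than one process, so treat the result as an estimate only.

```python
import resource


def process_headroom(planned_parallel_tasks, safety_factor=0.5):
    """Return True if the planned task count sits below a fraction of the
    per-user process limit (soft RLIMIT_NPROC).

    Assumption: the relevant limit is RLIMIT_NPROC; other limits
    (memory, PIDs, cgroups) are not checked here.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    if soft == resource.RLIM_INFINITY:
        # No per-user process limit is configured.
        return True
    return planned_parallel_tasks <= soft * safety_factor


if __name__ == "__main__":
    for tasks in (100, 1000, 10000):
        print(tasks, process_headroom(tasks))
```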

To profile the runs I suggest using /usr/bin/time: we understand this command well and the results will be comparable to prior tests.

Notes:

  • time and /usr/bin/time may differ in your shell (time is often a shell builtin).
  • Always use verbose mode.
  • On Darwin (macOS), install GNU time.

$ /usr/bin/time -o <output-file> -v cylc play <workflow> -s <option> ... --no-detach
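
To repeat the command above over a slowly increasing ramp of template variable values, something like the following sketch could be used. The workflow name, the template variable name (`TASKS`), the `results` directory and the output file naming are all hypothetical; it also assumes each value is run against a clean run of the workflow. By default it only prints the commands so you can check them before running anything.

```python
import subprocess
from pathlib import Path


def build_command(workflow, variable, value, out_dir="results"):
    """Build the /usr/bin/time command line for one scaling value."""
    # One output file per run; "/" in workflow IDs is replaced for the file name.
    name = f"{workflow.replace('/', '_')}-{variable}-{value}.time"
    out_file = Path(out_dir) / name
    return [
        "/usr/bin/time", "-o", str(out_file), "-v",
        "cylc", "play", workflow,
        "-s", f"{variable}={value}",
        "--no-detach",
    ]


def run_ramp(workflow, variable, values, out_dir="results", dry_run=True):
    """Run (or just print) one profiled experiment per scaling value, in order."""
    commands = [build_command(workflow, variable, v, out_dir) for v in values]
    if not dry_run:
        Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # Runs sequentially so experiments do not compete for the machine.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical workflow and variable names; ramp up slowly.
    run_ramp("my-workflow", "TASKS", [10, 20, 50, 100])
```

Running the experiments one after another, rather than in parallel, keeps the machine otherwise idle for each measurement.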

In the output we are interested in:

  • Wallclock time.
  • User + System time (i.e. CPU usage).
  • RSS memory usage.
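
To pull these three numbers out of the output file, a small parser along these lines may help. It is a sketch that assumes the field labels printed by GNU time in verbose mode (`User time (seconds)`, `System time (seconds)`, `Elapsed (wall clock) time (h:mm:ss or m:ss)`, `Maximum resident set size (kbytes)`); the function names and the returned dictionary keys are hypothetical.

```python
def parse_elapsed(text):
    """Convert 'h:mm:ss' or 'm:ss.ss' into seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_time_output(text):
    """Extract wallclock, CPU and peak RSS from `/usr/bin/time -v` output."""
    fields = {}
    for line in text.splitlines():
        # Split on the last ": " because some labels contain colons.
        key, sep, value = line.strip().rpartition(": ")
        if sep:
            fields[key] = value
    user = float(fields["User time (seconds)"])
    system = float(fields["System time (seconds)"])
    return {
        "wallclock_s": parse_elapsed(
            fields["Elapsed (wall clock) time (h:mm:ss or m:ss)"]
        ),
        "cpu_s": user + system,
        "max_rss_kb": int(fields["Maximum resident set size (kbytes)"]),
    }


if __name__ == "__main__":
    import sys

    # Usage: python parse_time.py <output-file>
    with open(sys.argv[1]) as handle:
        print(parse_time_output(handle.read()))
```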

For fair tests you need to run all experiments on the same hardware and to ensure the machine is idle. Running on a cluster may be a good idea but you may need to claim the whole node to meet these conditions (with the obvious resource wastage implications).

We don't have any up-to-date results for Cylc 7, so you will need to profile that too. You can use the automated profile battery there if you like; I think you run it like so:

$ cylc profile-battery -e complex -v 7.x.x
