NUT builds often complete the bulk of the work in a couple of hours, after which a "tail" of builds that only one or two nodes can satisfy is processed, so a job's wallclock time from start to end often ranges from 5 to 8 hours.
This issue suggests that the dynamatrix parallel-stage generator should identify how many of the currently available build agents match each label expression, and add the more constrained stages (those that fewer agents can satisfy) to the Map earlier, for subsequent parallel queuing and execution. This assumes that the parallel step implementation then reads the Map in order of addition, which is the best effort we can make from this library. Such ordering would let specific nodes first tend to the stages which only they can run, and then become available to pick from the queue any less-constrained stages for which they are just "one of many" capable agents.
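
To illustrate the proposed ordering, here is a minimal sketch in Python rather than the library's Groovy. It is not the dynamatrix implementation: the agent inventory, the stage names, and the helpers `count_matching_agents` and `order_stages_by_scarcity` are all hypothetical, and a label expression is simplified to a plain set of labels that must all be present (real Jenkins label expressions are richer). It relies on Python dictionaries preserving insertion order, which stands in for the "read in order of addition" assumption above.

```python
from typing import Dict, FrozenSet

# Hypothetical inventory: agent name -> labels it currently offers.
AGENTS: Dict[str, FrozenSet[str]] = {
    "linux-1": frozenset({"linux", "gcc"}),
    "linux-2": frozenset({"linux", "gcc", "clang"}),
    "linux-3": frozenset({"linux", "gcc"}),
    "freebsd-1": frozenset({"freebsd", "clang"}),
}

# Hypothetical stages: stage name -> labels an agent must have.
# Assumption: an "expression" here is just an AND of labels.
STAGES: Dict[str, FrozenSet[str]] = {
    "gcc-on-linux": frozenset({"linux", "gcc"}),
    "clang-anywhere": frozenset({"clang"}),
    "freebsd-build": frozenset({"freebsd"}),
    "any-linux": frozenset({"linux"}),
}


def count_matching_agents(required: FrozenSet[str],
                          agents: Dict[str, FrozenSet[str]]) -> int:
    """Count the agents whose label set includes every required label."""
    return sum(1 for labels in agents.values() if required <= labels)


def order_stages_by_scarcity(stages: Dict[str, FrozenSet[str]],
                             agents: Dict[str, FrozenSet[str]]
                             ) -> Dict[str, FrozenSet[str]]:
    """Return a new dict with the most constrained stages inserted first.

    sorted() is stable, so stages with equal match counts keep their
    original relative order. A stage that no agent matches would sort
    first in this sketch.
    """
    ranked = sorted(stages.items(),
                    key=lambda item: count_matching_agents(item[1], agents))
    return dict(ranked)


if __name__ == "__main__":
    for name, required in order_stages_by_scarcity(STAGES, AGENTS).items():
        print(name, count_matching_agents(required, AGENTS))
```

With this sample data, `freebsd-build` (one capable agent) is queued before `clang-anywhere` (two), and the two stages that any of three Linux agents can run come last. Whether the queue actually honours that order still depends on the parallel step, as noted above.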