Restart() returns early on non-reader ranks and can skip required post-read redistribution #5059

@WeiqunZhang

Description

  • Type: Correctness / Parallel synchronization
  • Severity: High
  • Component: Particle restart path (level-loss handling)
  • Location:
    • Src/Particle/AMReX_ParticleIO.H:855
    • Src/Particle/AMReX_ParticleIO.H:939

Problem

In the lev > finestLevel() branch, ranks with rank >= NReaders execute:

if (rank >= NReaders) { return; }

This exits Restart() immediately from inside the per-level loop. Those ranks skip the rest of the restart flow, including the final Redistribute() call and the consistency checks.

Impact

  • Non-reader ranks can exit restart before the collective redistribution phase.
  • In large runs with particles.nreaders < NProcs, this can leave the post-restart state inconsistent and can hang the run: reader ranks enter the communication-heavy redistribution while the non-reader ranks have already returned and never take part.
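
To make the affected set concrete, here is a minimal sketch that lists which ranks would take the early return for a given reader count. The helper name `ranks_leaving_early` and its parameters are hypothetical. It only mirrors the `rank >= NReaders` condition quoted in the Problem section and does not call AMReX or MPI.

```cpp
#include <vector>

// Hypothetical helper (not part of AMReX): collects the ranks for which
// `rank >= n_readers` holds, i.e. the ranks that would hit the early
// return in the level-loss branch.
std::vector<int> ranks_leaving_early(int n_procs, int n_readers)
{
    std::vector<int> out;
    for (int rank = 0; rank < n_procs; ++rank) {
        if (rank >= n_readers) { out.push_back(rank); }
    }
    return out;
}
```

With 8 ranks and 4 readers, the helper returns ranks 4 through 7. When the reader count equals the rank count the list is empty, which is consistent with the problem only showing up when particles.nreaders < NProcs.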

Suggested patch

Skip file reads for non-reader ranks, but do not leave Restart() early.

--- a/Src/Particle/AMReX_ParticleIO.H
+++ b/Src/Particle/AMReX_ParticleIO.H
@@
-            if (rank >= NReaders) { return; }
+            if (rank >= NReaders) { continue; }
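
The difference between the two statements can be sketched without MPI. This is a simplified model, not the AMReX implementation: `reaches_redistribute` and all of its parameters are hypothetical names, and the loop only mimics the shape of the per-level loop described above.

```cpp
// Hypothetical model of one rank's pass through a restart-like routine.
// Returns true if the rank gets past the per-level loop, i.e. reaches the
// point where the final collective step (Redistribute() in the issue)
// would be called.
bool reaches_redistribute(int rank, int n_readers,
                          int n_file_levels, int finest_level,
                          bool use_continue)
{
    for (int lev = 0; lev < n_file_levels; ++lev) {
        if (lev > finest_level) {
            // Level-loss branch: the file has more levels than the hierarchy.
            if (rank >= n_readers) {
                if (use_continue) { continue; }  // patched: skip reads only
                return false;                    // original: leave the routine
            }
            // Reader ranks would read particle files for this level here.
        }
    }
    return true;  // stands in for the post-loop collective phase
}
```

For a non-reader rank in the level-loss case, `reaches_redistribute(5, 4, 3, 1, false)` is false and `reaches_redistribute(5, 4, 3, 1, true)` is true. Whether `continue` alone is sufficient in the real code depends on what the rest of the loop body and the post-loop code expect from non-reader ranks, which this sketch does not model.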

Prepared by Codex
