Defer slicing for compute-on-fetch writer outputs#5502
Defer slicing for compute-on-fetch writer outputs#5502taimoorsohail wants to merge 8 commits intomainfrom
Conversation
|
@simone-silvestri did I just repeat work you already did in #5492 ?? |
|
Nono, this is an independent bug with output writers, good catch |
|
|
||
| function fetch_output(output::DeferredSlicedOutput, model) | ||
| full_output = fetch_output(output.source_output, model) | ||
| return view(full_output, output.write_indices...) |
There was a problem hiding this comment.
I think the fetch_output for AbstractFields return plain arrays, while this seems to return a Field?
There was a problem hiding this comment.
This computes a view on the result of fetch_output. So if fetch_output returns an Array, then view(full_output, ...) is a SubArray.
There was a problem hiding this comment.
This returns something Array-like, which I think is OK?
Co-authored-by: Simone Silvestri <silvestri.simone0@gmail.com>
| return nothing | ||
| end | ||
|
|
||
| function test_jld2_with_halos_false_compute_outputs(arch) |
There was a problem hiding this comment.
these tests pass also on main, I am not sure what gets exercised here
|
I might be confused about the description, but we of course do not want to compute 3D and then only output a small 2D slice? Can this sentence be clarified:
|
| additional_kw = user_output isa Field ? NamedTuple() : (; compute=false) | ||
|
|
||
| return Field(user_output; indices, additional_kw...) | ||
| if !with_halos && requires_compute_on_fetch(user_output) |
There was a problem hiding this comment.
I'm struggling to grok what "compute on fetch" means. Can we try to rethink the semantics? I am admittedly pretty dense 😂
There was a problem hiding this comment.
what about needs_halo_compute_for_write? Is that better framing?
There was a problem hiding this comment.
what are the cases that require halo slicing but do not require the second condition?
There was a problem hiding this comment.
Is the condition being sought iscomputed? In that case, it would be a test for non-Nothing Field.operand and could belong in Fields module.
| struct DeferredSlicedOutput{SO, I} | ||
| source_output :: SO | ||
| write_indices :: I | ||
| end |
There was a problem hiding this comment.
If the purpose of this is only to slice of halo regions, should we call it OutputWithoutHalos or something?
There was a problem hiding this comment.
how about ComputeThenSliceOutput? Because that is what it is doing...?
There was a problem hiding this comment.
Or ComputeWithHalosThenSliceOutput to be more clear
There was a problem hiding this comment.
Ah I see. Maybe ComputedHalosWithoutHalos.
This represents output that must be compute! ?
There was a problem hiding this comment.
Is there any case where we would compute without halos? This part is causing the confusion for me
| return Field(user_output; indices, additional_kw...) | ||
| if !with_halos && requires_compute_on_fetch(user_output) | ||
| source_indices = output_indices(user_output, user_indices, true) | ||
| source_output = Field(user_output; indices=source_indices, additional_kw...) |
There was a problem hiding this comment.
this answers my earlier concern since the Field is indeed constructed with a non-trivial indices
I think the answer is that this description is a little misleading. Slicing is not completely deferred. We slice TWICE: during output construction, and then further for output. Is that right? |
Yes, but during output construction, we slice with halos (as the halo information is required for compute) and then for output writing we remove the halos before saving. |
Co-authored-by: Simone Silvestri <silvestri.simone0@gmail.com>
|
OK, I removed one of the tests which was indeed not accomplishing anything. The other one is good though. |
|
I had a chat with @simone-silvestri and I think this is good to go? |
|
Is this only for distributed? It would be nice to see a test that fails on main (serial) but passes on this PR? |
|
Yes, this is only an issue for distributed, where the domain is partitioned. I believe that |
|
Sorry to come late in the chat, but don't we anyway always want to slice off the halos only at the very end, just before writing to disk? Why is this only a distributed-related thing? |
|
@navidcy You are right -- I think that the bug is exposed only in distributed, for some reason it works in serial (i.e. tests pass) but is still a bug... |
Summary
This PR fixes a distributed output-writer bug for
with_halos=false.For diagnostics evaluated when output is written (derived fields, reductions, and time averages), interior slicing could be applied too early. In distributed runs, that can remove halo-needed indices from the compute path and trigger halo/bounds/MPI failures.
Minimal Reproducer (before fix)
Failure mode: construction/fetch path can use overly-sliced indices for the compute step, causing distributed halo communication failure.
What Changed (by file)
src/OutputWriters/output_construction.jlwith_halos=falsein those cases, construction now creates:src/OutputWriters/fetch_output.jlDeferredSlicedOutputhandling.write_indices) before conversion/write.This preserves correct distributed compute behavior while still writing interior-only data for
with_halos=false.test/test_jld2_writer.jlwith_halos=falsefor write-time evaluated diagnostics under:TimeIntervalIterationIntervalAveragedTimeIntervaltest/test_mpi_tripolar.jlwith_halos=false.Behavioral Clarification
This does not compute a full 3D domain and then keep a tiny 2D slice.
It computes over the requested region with halo-capable indices (for safe communication), then applies interior slicing only for what is written.
Compatibility
with_halos=falseoutput shape semantics remain interior-only.with_halos=truebehavior unchanged.