openPMD plugin: Restart from checkpoint with non-matching domain decomposition #5508
Conversation
This reverts commit 17070daff52cd0d1ccc790ee0738e7ff9d99ef2c.
This reverts commit 2a67584769ab5d818d5fc0a1b205997e730931c7.
franzpoeschel left a comment
Some lines to pay special attention to for code reviewers
 * with, we must take care not to index past the dataset boundaries. Just loop around to the start
 * in that case. Not the finest way, but it does the job for now..
 */
start.push_back(gridPos.revert()[d] % extent[d]);
Is this really the best solution? This code loads nextId and startId. If the new GPU grid does not match the old one, this logic currently just wraps around via modulus and hands out an ID from an arbitrary rank's entry. I am not sure whether that has downsides.
for(size_t d = 0; d < simDim; ++d)
{
    auto positionInD = positionVec[d] + positionOffsetVec[d];
    if(positionInD < patchTotalOffset[d] || positionInD >= patchUpperCorner[d])
This line decides whether a particle from a partially overlapping domain should be considered. I treat the lower boundary as inclusive and the upper boundary as exclusive; otherwise single particles were loaded into both processes.
DataSpace<simDim> const patchTotalOffset
    = localToTotalDomainOffset + threadParams->localWindowToDomainOffset;
DataSpace<simDim> const patchExtent = threadParams->window.localDimensions.size;
DataSpace<simDim> const patchUpperCorner = patchTotalOffset + patchExtent;
Someone please check that all these position computations match up; they seem to in my tests.
    filterKeep,
    filterRemove,
    alpaka::getPtrNative(filter));
eventSystem::getTransactionEvent().waitForFinished();
Someone please verify the kernel and its call. I suspect that the call can be made more efficient.
// patchExtent)) <<
// '\n';
if((patchTotalOffset <= offsets[i]) == true_
   && ((offsets[i] + extents[i]) <= (patchTotalOffset + patchExtent)) == true_)
Again, someone please verify the positioning logic here. Note that both ends are inclusive, since in the normal case we compare a particle patch from disk against itself, i.e. against the local window, which is identical.
include/picongpu/plugins/openPMD/restart/RestartFieldLoader.hpp (outdated, resolved)
Let me merge this; there is no need to delay this nice PR. I will soon add the correct particle id counter fix, but that is also something which should not block the PR.
Merged 00dbba5 into ComputationalRadiationPhysics:dev
Factored out of #5405 as a standalone feature.
A different domain decomposition means a different distribution of parallel processes across the simulation domain; the number of parallel processes may vary between runs.
TODO:
To reviewers, please also check the comments below