Use a supplied communicator group and not MPI_COMM_WORLD #237

@ss421

While testing malak with XIOS in server mode, I encountered an error towards the end of the application that appears to be related to the VarObs and CX writers. The error message is not very helpful, so I tested without the writer filters and the application ran to completion. I suspect this is related to the communicator group: when running in server mode, the JEDI application does not use the MPI_COMM_WORLD communicator. Instead, it uses a split communicator, because some PEs are used by the server.

I searched the code for MPI_COMM_WORLD and found a few references, the following in particular:

/// \note This filter must only be used with ObsSpaces using the \c MPI_COMM_WORLD communicator,
/// otherwise a deadlock will occur while writing the VarObs file. This is due to a limitation of
/// the \c Ops_WriteVarobs function, which could be removed by replacing \c mpl_comm_world in the
/// call to \c Ops_Mpl_Gatherv by \c mpi_group (for consistency with all other MPI calls in \c
/// Ops_WriteVarobs).

see: https://github.com/MetOffice/opsinputs/blob/2720b0b5d3ec2a129e27475d0fc6911547b4de17/src/opsinputs/VarObsWriter.h#L46C1-L50C22
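To illustrate the suspected failure mode, here is a toy model in plain Python threads. It is not MPI and not the opsinputs code: a `threading.Barrier` stands in for a collective call, and the names `run_collective`, `WORLD_SIZE` and `client_ranks` are hypothetical. The assumption is that the server PEs never reach the writer, so a collective sized for the world communicator waits for ranks that never arrive, while one sized for the split communicator completes.

```python
import threading


def run_collective(n_expected, participating_ranks, timeout=0.5):
    """Return True if every participant passes the collective, False if it stalls.

    n_expected plays the role of the communicator size; the barrier only
    releases once that many callers have arrived.
    """
    barrier = threading.Barrier(n_expected)
    results = {}

    def worker(rank):
        try:
            barrier.wait(timeout=timeout)
            results[rank] = True
        except threading.BrokenBarrierError:
            # The timeout stands in for a real deadlock, which would hang forever.
            results[rank] = False

    threads = [threading.Thread(target=worker, args=(r,)) for r in participating_ranks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(results.values())


WORLD_SIZE = 6                # all PEs, including those given to the server
client_ranks = [0, 1, 2, 3]   # PEs running the application; ranks 4 and 5 act as servers

# Collective over the "world": waits for 6 callers, only 4 ever arrive.
world_ok = run_collective(WORLD_SIZE, client_ranks)

# Collective over the "split" group: waits for 4 callers, all 4 arrive.
group_ok = run_collective(len(client_ranks), client_ranks)

print("world communicator completes:", world_ok)
print("split communicator completes:", group_ok)
```

If this model matches what happens in server mode, it would be consistent with the note quoted above: the gather would need to use the same group as the other MPI calls in the writer.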

This suggests that some additional work is required if we want to use the XIOS server together with the VarObs writer. I don't imagine that it is too much work. Adding a few people who may know:

Wojciech Śmigaj (@wsmigaj) David Simonin (@DavidSimonin) Chris Thomas (@ctgh) Michael Cooke (@mikecooke77) Adam Maycock (@adammaycock) David Davies (@DJDavies2)

If others have any ideas, please comment below.

Thanks, Steve.

Labels: enhancement
