On Jun 29, 2006, at 5:23 PM, Tom Rosmond wrote:
> I am testing the one-sided message passing (mpi_put, mpi_get) that
> is now supported in the 1.1 release. It seems to work OK for some
> simple test codes, but when I run my big application, it fails.
> This application is a large weather model that runs operationally
> on the SGI Origin 3000, using the native one-sided message passing
> that has been supported on that system for many years. At least on
> that architecture, the code always runs correctly for processor
> numbers up to 480. On the O3K a requirement for the one-sided
> communication to work correctly is to use 'mpi_win_create' to
> define the RMA 'windows' in symmetric locations on all processors,
> i.e. the same 'place' in memory on each processor. This can be
> done with static memory, i.e., in common; or on the 'symmetric
> heap', which is defined via environment variables. In my
> application the latter method is used. I define several of these
> 'windows' on the symmetric heap, each with a unique handle.
> Before I spend my time trying to diagnose this problem further, I
> need as much information about the OpenMPI one-sided implementation
> as available. Do you have a similar requirement or criteria for
> symmetric memory for the RMA windows? Are there runtime parameters
> that I should be using that are unique to one-sided message passing
> with OpenMPI? Any other information will certainly be appreciated.
There are no requirements on the one-sided windows in terms of buffer
pointers. Our current implementation is over point-to-point so it's
kinda slow compared to real one-sided implementations, but has the
advantage of working with arbitrary window locations.
There are only two parameters to tweak in the current implementation:
osc_pt2pt_eager_send: If this is 1, we try to start progressing
before the synchronization point. The default is 0. This is not
well tested, so I recommend leaving it 0. It's safer at this point.
osc_pt2pt_fence_sync_method: This one might be worth playing with,
but I doubt it could cause your problems. This is the collective we
use to implement MPI_WIN_FENCE. Options are reduce_scatter (default)
and alltoall. Again, I doubt it will make any difference, but it
would be interesting to confirm that.
You can set the parameters at mpirun time:
mpirun -np XX -mca osc_pt2pt_fence_sync_method reduce_scatter ./
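If you want to try both fence methods back to back, something along
these lines may be convenient. It is only a sketch: it relies on Open
MPI also reading MCA parameters from environment variables named
OMPI_MCA_<parameter>, and the APP and NP values are placeholders I
made up for illustration, not anything from your setup.

```shell
# APP and NP are hypothetical placeholders; substitute your own values.
APP=${APP:-./your_app}
NP=${NP:-4}

for method in reduce_scatter alltoall; do
    # Environment form of "-mca osc_pt2pt_fence_sync_method <method>"
    export OMPI_MCA_osc_pt2pt_fence_sync_method="$method"
    # Keep eager send at its default value of 0
    export OMPI_MCA_osc_pt2pt_eager_send=0
    echo "fence sync method: $method"
    # Only launch if mpirun is on the PATH and the placeholder app exists
    if command -v mpirun >/dev/null 2>&1 && [ -x "$APP" ]; then
        mpirun -np "$NP" "$APP" || echo "run failed with $method"
    fi
done
```

Running "ompi_info --param osc pt2pt" should list the parameters the
component exposes, along with their current values, which is a quick
way to check that a setting was picked up.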
Our one-sided implementation has not been as well tested as the rest
of the code (as this is our first release with one-sided support).
If you can share any details on your application or, better yet, a
test case, we'd appreciate it.
There is one known issue with the implementation. It does not
support using MPI_ACCUMULATE with user-defined datatypes, even if
they are entirely composed of one predefined datatype. We plan on
fixing this in the near future, and an error message will be printed
if this situation occurs.
Open MPI developer