Open MPI Development Mailing List Archives

Subject: [OMPI devel] Revise paffinity method?
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-05-06 10:27:03


We have used several different methods for binding processes over the
years. Our current method has the orted pass parameters through the
environment to each process when it fork/exec's it. Each process then
binds itself to the specified processor(s) during MPI_Init via a call
to the OPAL paffinity framework.
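
As a rough illustration of this bind-at-init pattern, here is a minimal Linux-only sketch. The variable name EXAMPLE_BIND_CPUS and both function names are hypothetical stand-ins; they are not the actual parameters the orted passes, nor the paffinity API.

```python
import os

# Hypothetical variable name; stands in for whatever the launcher exports.
BIND_ENV = "EXAMPLE_BIND_CPUS"


def cpus_from_env(environ):
    """Return the set of CPU ids requested via BIND_ENV, or None if unset."""
    raw = environ.get(BIND_ENV, "").strip()
    if not raw:
        return None
    return {int(tok) for tok in raw.split(",")}


def bind_self_at_init(environ=os.environ):
    """Bind the calling process, as a library init routine might do.

    Everything the program does before calling this runs unbound.
    """
    cpus = cpus_from_env(environ)
    if cpus is not None:
        # Linux-only call; pid 0 means "the calling process".
        os.sched_setaffinity(0, cpus)
    return cpus
```

The point of the sketch is the ordering: the binding only takes effect when the init routine runs, not when the process starts.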

It was noted earlier this week on the user mailing list that not
binding the process until it calls MPI_Init has some disadvantages - I
have added some that we have previously discussed:

1. the process is free to move around -until- it calls MPI_Init, thus
possibly conflicting with other processes on the node that have
already called MPI_Init and been bound.

2. memory allocated by the process prior to calling MPI_Init may not
be local to the eventual processor the process is bound to, thus
hurting performance.

3. while we support non-MPI applications, our current method will not
bind them. This was actually one of the problems that motivated the
user list discussion as the user was testing with "hostname" and
failing to see it bound. While we can argue that we are Open -MPI-,
there is a little issue here with "user surprise".

4. from the user mailing list, it was clear that some users at least
expected the process to be bound from start of execution. Eugene did
note on one such discussion that he has seen similar behavior (i.e.,
not bound until MPI_Init) on other MPI implementations, but he, I
think, fairly questioned whether or not this was the right way to go.

I am sure others can think of more issues - this isn't meant to be an
exhaustive list.
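
One way to observe the behavior in items 1 and 3 is to launch a small program that reports its own affinity mask at startup instead of "hostname". The sketch below is Linux-only, and the helper name is illustrative. Treating "fewer allowed CPUs than the machine has" as "bound" is a simplifying assumption, since other mechanisms can also restrict the mask.

```python
import os


def describe_affinity(allowed, cpu_count):
    """Summarize an affinity mask relative to the machine's CPU count."""
    cpus = sorted(allowed)
    # Heuristic (assumption): a mask narrower than the full machine means bound.
    state = "bound" if len(cpus) < cpu_count else "unbound"
    return f"{state}: allowed CPUs {cpus}"


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    # os.sched_getaffinity is Linux-only; pid 0 means the calling process.
    print(describe_affinity(os.sched_getaffinity(0), os.cpu_count() or 1))
```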

I should note that we never see these problems in our tests because
they always call MPI_Init right away at the beginning of the program.
I admit that many of our local applications do the same - however,
many of them also set up memory regions prior to calling MPI_Init,
which does reflect Eugene's use case.

Any thoughts on this? Should we change it?

If so, who wants to be involved in the re-design? I'm pretty sure it
would require some modification of the paffinity framework, plus some
minor mods to the odls framework. Since you cannot bind a process
other than yourself, it would also need a new small "proxy" script
that would bind-then-exec each process started by the orted (Eugene
posted a candidate on the user list, though we will have to deal with
some system-specific issues in it).
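
To make the bind-then-exec idea concrete, here is a minimal Linux-only sketch of such a wrapper. It is not Eugene's candidate script. The script name, the CPU-list syntax, and the function names are assumptions made for illustration. It relies on the Linux behavior that the affinity mask is preserved across exec, so the target program starts out already bound.

```python
import os
import sys


def parse_cpu_list(spec):
    """Parse a list such as "0,2-3" into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        lo, sep, hi = part.partition("-")
        # A bare number "5" is treated as the one-element range 5-5.
        cpus.update(range(int(lo), int(hi if sep else lo) + 1))
    return cpus


def bind_then_exec(cpu_spec, argv):
    """Bind the calling process, then replace it with the target program."""
    os.sched_setaffinity(0, parse_cpu_list(cpu_spec))  # Linux-only
    os.execvp(argv[0], argv)  # does not return on success


if __name__ == "__main__" and len(sys.argv) >= 3:
    # Hypothetical invocation: bind_exec.py CPU_LIST PROGRAM [ARGS...]
    bind_then_exec(sys.argv[1], sys.argv[2:])
```

Under these assumptions, something like "python3 bind_exec.py 0,2-3 hostname" would bind the wrapper to CPUs 0, 2 and 3 before "hostname" ever runs, which would also cover non-MPI applications.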

Ralph