On Apr 23, 2007, at 9:22 PM, Mostyn Lewis wrote:
> I tried this on a humble PC and it works there.
> I see in the --mca mpi_show_mca_params 1 printout that there is a
> [bb17:06646] paffinity=
> entry, so I expect that sets the value back to 0?
There should be an mpi_paffinity_alone parameter; that's what drives
the whole process.
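For example, you can force affinity on for a single run (the -np
count and executable name below are just placeholders) and check the
parameter's current default with ompi_info:

mpirun --mca mpi_paffinity_alone 1 -np 4 ./my_mpi_app
ompi_info --param mpi all | grep paffinity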
> I'll get to the SLES10 cluster when I can (other people are doing
> benchmarks) and see what I can find. I see there's no stdbool.h
> there, so maybe this is an artifact of defining the bool type on an
> Opteron. I'll get back to you when I can.
Lack of (bool) shouldn't be a factor. If it is, we have a bug.
> The test of boundness was a perl program invoked via system() from a
> C MPI program. The /proc/<pid>/stat result shows the CPU you are
> bound to (3rd number from the end), and a taskset call gets back the
> mask to show whether you are bound or not.
Hmm. What version of the kernel do you have? I know there were some
issues with this information until recent versions (I confess to not
knowing in which version the information became stable/reliable,
unfortunately).
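As a cross-check that doesn't depend on the /proc/<pid>/stat format,
you could ask the kernel for the mask directly from within the
process via sched_getaffinity(2). A minimal sketch (glibc/Linux-
specific, error handling kept short):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t mask;
    int cpu, count = 0;

    /* Ask the kernel which CPUs this process may run on. */
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_getaffinity");
        return 1;
    }
    for (cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &mask)) {
            printf("allowed on CPU %d\n", cpu);
            ++count;
        }
    }
    /* Exactly one allowed CPU means the process is bound. */
    printf("%s\n", (1 == count) ? "bound" : "not bound to a single CPU");
    return 0;
}

(Run one copy per MPI rank and compare against what the perl/taskset
test reports.)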
Are you launching under a scheduler, perchance? N1GE may be setting
affinity before MPI processes are even launched, for example...?
(I'm not too familiar with N1GE -- I'm speculating).
There's a simple acid test to see if OMPI is setting the affinity or
not: remove the Linux paffinity component (assuming you compiled the
components as plugins/dynamic shared objects). Go to the directory
where the OMPI plugins were installed ($prefix/lib/openmpi, by
default); there should be 2 files in there named
mca_paffinity_linux.*. This is the component that knows how to set
processor affinity in Open MPI; if it's not there, Open MPI won't
know how to set affinity on your system (and therefore won't).
Rename or move these files so that they are not findable, for
example:
mkdir junk
mv mca_paffinity_linux.* junk/
And run your test again. If you're still getting affinity set, then
it's not Open MPI that is setting it.
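You can double-check that the component is really gone before
re-running: ompi_info lists every paffinity component that Open MPI
can find, so after the move the linux one should no longer show up:

ompi_info | grep paffinity

(Afterwards, mv the files back out of junk/ to restore normal
behavior.)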