
Open MPI User's Mailing List Archives


From: Mostyn Lewis (Mostyn.Lewis_at_[hidden])
Date: 2007-04-24 17:28:01


Well, I'm sorry to have caused even a smidgen of grief here.
I moved aside the *paffinity_linux* module and, lo and behold, it still
bound. I was using InfiniPath HCAs and beta software, and eventually found
(sigh) a variable to stop the affinity - IPATH_NO_CPUAFFINITY.

So, running

export IPATH_NO_CPUAFFINITY=1
$OPENMPI_GCC/bin/mpirun -x IPATH_NO_CPUAFFINITY -np 1 -host s0158 ./cpi

showed me what I wanted to see:

18236:cpi *->0 (f=noaffinity,0,1,2,3)

This, in the jargon of my utility, says the taskset mask is 0xf,
so the process is not affined, and the ->0 says it is on CPU 0.
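
If anyone wants a quick check of their own, something along these lines
(just a sketch with taskset and /proc, not my actual utility; 18236 here is
only an example PID, and the /proc field position is as it was on the kernels
discussed in this thread - it can shift between kernel versions) gives the
same two facts:

pid=18236                    # PID of the MPI process to inspect
taskset -p $pid              # affinity mask; "f" on a 4-core box means not affined
# The "processor" field of /proc/<pid>/stat is the CPU the task last ran on
# (3rd number from the end on the kernels described in this thread)
awk '{ print "on CPU " $(NF-2) }' /proc/$pid/stat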

The reason all this comes about is that I do endless benchmarks for my
employer and get to use Scali, QuickSilver (SilverStorm), QLogic (InfiniPath),
all the ethernet MPICHes and LAMs (fading fast) - even HP MPI - on
our racks, which have x cores per socket. Sometimes we like to use
our own methodologies to choose where to bind, and in that case we need to
switch off any supplied binding. I really wish the default everywhere were no
binding, like Open MPI's, with docs that point out the variables, but that's
not always the case.
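
For what it's worth, when we do our own placement the crude version is just a
wrapper script around the real binary - this is only a sketch, and the rank
variable is a placeholder (newer Open MPI exports OMPI_COMM_WORLD_LOCAL_RANK;
older versions and the other MPIs each have their own spelling):

#!/bin/sh
# bind_wrapper.sh - pin each local rank to its own core, then exec the real app
RANK=${OMPI_COMM_WORLD_LOCAL_RANK:-0}   # placeholder; substitute your MPI's rank variable
exec taskset -c $RANK "$@"

run as, say,

mpirun -np 4 -host s0158 ./bind_wrapper.sh ./cpi

with any supplied binding (e.g. IPATH_NO_CPUAFFINITY=1 for InfiniPath) switched
off first so the two schemes don't fight.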

Sorry again for any trub,
Mostyn

On Tue, 24 Apr 2007, Jeff Squyres wrote:

> On Apr 23, 2007, at 9:22 PM, Mostyn Lewis wrote:
>
>> I tried this on a humble PC and it works there.
>> I see in the --mca mpi_show_mca_params 1 printout that there is a
>> [bb17:06646] paffinity=
>> entry, so I expect that sets the value back to 0?
>
> There should be an mpi_paffinity_alone parameter; that's what drives
> the whole process.
>
>> I'll get to the SLES10 cluster when I can (other people are doing
>> benchmarks) and see what I can find. I see there's no stdbool.h there,
>> so maybe this is an artifact of defining the bool type on an
>> Opteron. I'll get back to you when I can.
>
> Lack of (bool) shouldn't be a factor. If it is, we have a bug.
>
>> The test of boundness was a perl program invoked via system() in a
>> C MPI program. The /proc/<pid>/stat result shows the CPU you are
>> bound to (3rd number from the end) and a taskset call gets back the
>> mask to show if you are bound or not.
>
> Hmm. What kernel version do you have? I know there were some issues
> with this information until recent versions (I confess to not knowing
> in which version the information became stable/reliable, unfortunately).
>
> Are you launching under a scheduler, perchance? N1GE may be setting
> affinity before MPI processes are even launched, for example...?
> (I'm not too familiar with N1GE -- I'm speculating).
>
> There's a simple acid test to see if OMPI is setting the affinity or
> not: remove the linux paffinity component (assuming you compiled the
> components as plugins/dynamic shared objects). Go to the OMPI
> installation directory:
>
> $prefix/lib/openmpi
>
> There should be 2 files in there named mca_paffinity_linux.*. This
> is the component that knows how to set processor affinity in Open
> MPI; if it's not there, Open MPI won't know how to set affinity on
> your system (and therefore won't). Rename or move these files so
> that they are not findable, such as:
>
> cd $prefix/lib/openmpi
> mkdir junk
> mv *paffinity_linux* junk
>
> And run your test again. If you're still getting affinity set, then
> it's not Open MPI that is setting it.
>
> --
> Jeff Squyres
> Cisco Systems