Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Bad Infiniband latency with subounce
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-02-15 23:21:13

On Feb 15, 2010, at 8:44 PM, Terry Frankcombe wrote:

> On Mon, 2010-02-15 at 20:18 -0700, Ralph Castain wrote:
>> Did you run it with -mca mpi_paffinity_alone 1? Given this is 1.4.1, you can set the bindings to -bind-to-socket or -bind-to-core. Either will give you improved performance.
>> IIRC, MVAPICH defaults to -bind-to-socket. OMPI defaults to no binding.
> Is this sensible? Won't most users want processes bound? OMPI's
> supposed to "do the right thing" out of the box, right?

Well, that depends on how you look at it. This has been the subject of a lot of debate within the devel community. If you bind by default on a shared-node cluster, you can really mess people up. On the other hand, if you don't bind by default, people who run benchmarks without looking at the options can get bad numbers. Unfortunately, there is no automated way to tell whether a cluster is configured for shared use or dedicated nodes.

I honestly don't know that "most users want processes bound". One installation I was at set binding by default using the system mca param file, and got yelled at by a group of users that had threaded apps - and most definitely did -not- want their processes bound. After a while, it became clear that nothing we could do would make everyone happy :-/

I doubt there is a right/wrong answer - at least, we sure can't find one. So we don't bind by default so we "do no harm", and put out FAQs, man pages, mpirun option help messages, etc. that explain the situation and tell you when/how to bind.
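
As a rough illustration of the choices mentioned in this thread, here is a minimal Python sketch that assembles an mpirun command line for each binding option. It only uses the options named above (-mca mpi_paffinity_alone 1, -bind-to-socket, -bind-to-core) as they apply to 1.4.1. The helper name build_mpirun_cmd, the policy labels, and the ./my_app path are made up for this example and are not part of Open MPI.

```python
import shlex

# Binding choices named in this thread (Open MPI 1.4.1 mpirun options).
# The dictionary keys are hypothetical labels chosen for this sketch.
BINDING_FLAGS = {
    "none": [],  # OMPI default per the thread: no binding
    "core": ["-bind-to-core"],
    "socket": ["-bind-to-socket"],
    "paffinity_alone": ["-mca", "mpi_paffinity_alone", "1"],
}


def build_mpirun_cmd(nprocs, app, binding="none"):
    """Return the mpirun argv list for the requested binding policy."""
    if binding not in BINDING_FLAGS:
        raise ValueError(f"unknown binding policy: {binding!r}")
    return ["mpirun", "-np", str(nprocs), *BINDING_FLAGS[binding], app]


if __name__ == "__main__":
    # ./my_app is a placeholder executable name.
    for policy in BINDING_FLAGS:
        print(shlex.join(build_mpirun_cmd(4, "./my_app", policy)))
```

On dedicated nodes you would likely pick "core" or "socket" for a latency benchmark; on shared nodes, or with threaded apps, "none" matches the do-no-harm default described above.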
