Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Job distribution on many-core NUMA system
From: A. Austen (metallurgist_at_[hidden])
Date: 2009-08-28 15:36:40

On Fri, 28 Aug 2009 10:16 -0700, "Eugene Loh" <Eugene.Loh_at_[hidden]> wrote:
> Big topic and actually the subject of much recent discussion. Here are
> a few comments:
> 1) "Optimally" depends on what you're doing. A big issue is making
> sure each MPI process gets as much memory bandwidth (and cache and other
> shared resources) as possible. This would argue that processes
> *should* be spread over as many sockets as possible. And, indeed, some
> MPIs default to this behavior. It depends on lots of things, including
> how much of the machine you're using.

Yes, you're right. In my case, my processes within a single MPI job are
tightly coupled. These jobs are communication-intensive, and if I want
to use as many of the processors as possible, then minimizing the
cross-processor communication should yield the best overall throughput.
However, I see your point completely -- for an embarrassingly parallel
problem, spreading the processes amongst the different sockets/memory
pools would probably give the best performance.

> 2) Currently (1.3.2), there is rankfile support. This is probably a
> little bit more gruesome than you hope for. E.g., if you have multiple
> jobs, you need to custom tailor the rankfile for each.

So then it would seem like at least for now, I can get the behavior I
want by using rankfiles?
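
To make the question concrete, here is a rough sketch of what I have in
mind: a small Python helper that writes a per-job rankfile packing one
job's ranks onto consecutive cores, starting at a chosen socket. The
helper name make_rankfile, the hostname "node01", and the socket/core
counts are all hypothetical, and the "rank N=host slot=socket:core" line
format is only my understanding of the rankfile syntax, so please
correct me if it is off.

```python
def make_rankfile(host, first_socket, n_ranks, cores_per_socket):
    """Return rankfile text that packs n_ranks onto consecutive cores.

    Ranks fill one socket before moving to the next, starting at
    first_socket. All arguments are illustrative assumptions about the
    machine layout, not values taken from any real system.
    """
    lines = []
    for rank in range(n_ranks):
        socket = first_socket + rank // cores_per_socket  # fill a socket, then move on
        core = rank % cores_per_socket                    # core index within that socket
        # Assumed rankfile line format: rank <N>=<host> slot=<socket>:<core>
        lines.append(f"rank {rank}={host} slot={socket}:{core}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical second job: 4 ranks packed onto socket 2 of "node01".
    print(make_rankfile("node01", 2, 4, 4), end="")
```

I would then save the output to a file and launch with something like
"mpirun -np 4 -rf job2.rank ./my_app" (the file and program names are
placeholders), giving each job its own starting socket so that jobs do
not overlap.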

Also, if I use the rankfile to distribute the processes, how about the
affinity issue? Can I still use affinity and expect that it will apply
to the topology specified in the rankfile, or will all the MPI jobs
always try to bind to the same processors in sequence?