Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] some questions regarding the portals modules
From: Matney Sr, Kenneth D. (matneykdsr_at_[hidden])
Date: 2010-07-09 13:16:24


Hello Jerome,

The first one is simple. Portals is not thread-safe on the Cray XT. As I recall,
only the master thread can post an event, although any thread can receive
the event. I might have it backwards, though; it has been a couple of years
since I played with this.

The second one depends on how you use your Cray XT. In our case, the machine
is used as process-per-core; i.e., not as a collection of SMPs. For performance
reasons, you definitely do not want MPI threads. Also, since it is run process-per-core,
there is nothing to be gained with progress threads. Portals events will generate a
kernel-level interrupt. Whether you can run the XT as a cluster of SMPs is another question
entirely. We really have not tried this in the context of OMPI. But, in conjunction with
portals, this might open a "can of worms". For example, any thread can be run on any
core. But the portals ID for a thread will be the NID/PID pair for that core. If two threads
get scheduled to the same core, it would not be pretty.
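
To make the ID concern concrete, here is a minimal sketch in plain C. It does not use the real Portals API; the type and function names are hypothetical, and it assumes (as described above) that the ID a thread sees depends only on the node and the core it runs on.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a Portals process ID: a NID/PID pair. */
typedef struct {
    unsigned nid;  /* node ID */
    unsigned pid;  /* process ID slot on that node */
} portals_id_sketch_t;

/*
 * Assumption for illustration: the ID is a function of the node and the
 * core only, so it carries no information about which thread is running.
 */
portals_id_sketch_t id_for_core_sketch(unsigned nid, unsigned core)
{
    portals_id_sketch_t id;
    id.nid = nid;
    id.pid = core;
    return id;
}

/* Two endpoints are indistinguishable when both fields match. */
bool ids_collide_sketch(portals_id_sketch_t a, portals_id_sketch_t b)
{
    return a.nid == b.nid && a.pid == b.pid;
}
```

Under that assumption, two threads scheduled onto core 2 of node 7 both get the pair (7, 2), so ids_collide_sketch() reports them as the same endpoint, while threads on different cores or different nodes stay distinct.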

I could see lots of reasons why spawn might fail. First, it is run on a compute node.
There is no way for a compute node to run a process on another compute node.
Also, there will be no rank/size initialization forthcoming from ALPS. So, even if
it got past this, it would be running on the same node as its parent.
-- Ken Matney, Sr.
   Oak Ridge National Laboratory

On Jul 9, 2010, at 7:53 AM, Jerome Soumagne wrote:

Hi,

As I said in the previous e-mail, we've recently installed OpenMPI on a Cray XT5 machine, and we therefore use the portals and alps libraries. Thanks for providing the configuration script from Jaguar; it was very helpful and just had to be slightly adapted to use the latest CNL version installed on this machine.

I have some questions, though, regarding the use of the portals btl and mtl components. I noticed that when I compiled OpenMPI with mpi-thread support enabled and ran a job, the portals components refused to initialize because of these funny lines:

./mtl_portals_component.c
/* we don't run with no stinkin' threads */
if (enable_progress_threads || enable_mpi_threads) return NULL;

I'd like to know why MPI threads are disabled, since threads are supported on the XT5. Does the btl/mtl require thread safety to be implemented, or is it because of the portals library itself?
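
For reference, the effect of that check can be sketched in self-contained C. This is only an illustration of the quoted condition, not the actual Open MPI component code: the module type and the function name are hypothetical, and only the two flag names come from the lines above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the module a component hands back on success. */
typedef struct {
    const char *name;
} module_sketch_t;

static module_sketch_t portals_module_sketch = { "portals" };

/*
 * Same shape as the quoted check: return NULL, i.e. decline to
 * initialize, whenever progress threads or MPI threads are enabled.
 */
module_sketch_t *component_init_sketch(bool enable_progress_threads,
                                       bool enable_mpi_threads)
{
    if (enable_progress_threads || enable_mpi_threads) {
        return NULL;  /* the component opts out */
    }
    return &portals_module_sketch;
}
```

With both flags false the sketch returns the module; with either flag true it returns NULL, which matches what I observe: the component does not initialize in a thread-enabled build.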

I would also like to use the MPI_Comm_accept/connect functions. It seems that this is not possible with the portals mtl, even though spawn seems to be supported. Did I do something wrong, or is it really not supported?
If it is not supported, is it possible to extend this module to support these functions? We could help in doing that.

I'd also like to know whether there are any plans to create a module that uses the DMAPP interface for the Gemini interconnect.

Thanks.

Jerome

--
Jérôme Soumagne
Scientific Computing Research Group
CSCS, Swiss National Supercomputing Centre
Galleria 2, Via Cantonale  | Tel: +41 (0)91 610 8258
CH-6928 Manno, Switzerland | Fax: +41 (0)91 610 8282