Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] General question on the implementation of a "scheduler" on client side...
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-05-20 11:54:14

You're basically talking about implementing some kind of application-specific protocol. A few tips that may help in your design:

1. Look into MPI_Isend / MPI_Irecv for non-blocking sends and receives. These may be particularly useful in the server side, so that it can do other stuff while sends and receives are progressing.

2. You probably already noticed that collective operations (broadcasts and the like) need to be invoked by all members of the communicator. So if you want to do a broadcast, everyone needs to know. That being said, you can send a short message to everyone alerting them that a longer broadcast is coming -- then they can execute MPI_BCAST, etc. That works best if your broadcasts are large messages (i.e., you benefit from scalable implementations of broadcast) -- otherwise you're individually sending short messages followed by a short broadcast. There might not be much of a "win" there.

3. FWIW, the MPI Forum has introduced the concept of non-blocking collective operations for the upcoming MPI-3 spec. These may help; google for libnbc for a (non-optimized) implementation that may be of help to you. MPI implementations (like Open MPI) will feature non-blocking collectives someday in the future.

On May 20, 2010, at 5:30 AM, Olivier Riff wrote:

> Hello,
> I have a general question about the best way to implement an Open MPI application, i.e., the design of the application.
> A machine (I call it the "server") should regularly send tasks (byte buffers of widely varying size) to a cluster containing a lot of processors (the "clients").
> The server should send each client a different buffer, then wait for each client's answer (a buffer sent back by the client after some processing), and retrieve the result data.
> First I made something like this.
> On the server side: send buffers sequentially to each client using MPI_Send.
> On each client side: a loop that waits for a buffer using MPI_Recv, then processes the buffer and sends the result using MPI_Send.
> This is really not efficient because a lot of time is lost since the server sends and receives the buffers sequentially.
> Its only advantage is a pretty simple scheduler on the client side:
> Wait for buffer (MPI_Recv) -> Analyse it -> Send result (MPI_Send)
> My wish is to mix MPI_Send/MPI_Recv with other MPI functions like MPI_Bcast/MPI_Scatter/MPI_Gather... (as I imagine every MPI application does).
> The problem is that I cannot find an easy way for each client to know which kind of MPI function the server is currently calling. If the server calls MPI_Bcast, the client should do the same. Sending a first message each time to indicate the function the server will call next does not look very nice. Still, I do not see an easy/best way to implement an "adaptive" scheduler on the client side.
> Any tip, advice, or help would be appreciated.
> Thanks,
> Olivier

Jeff Squyres