Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] sm BTL flow management
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-06-25 07:58:11

Unfortunately, we cannot access this - permissions are denied. In
poking around, I found that your hg directory has permission 700.

Afraid you'll have to grant us permission to access this. :-/


On Jun 25, 2009, at 1:06 AM, Eugene Loh wrote:

> Bryan Lally wrote:
>> Ralph Castain wrote:
>>> Be happy to put it through the wringer... :-)
>> My wringer is available, too.
> 'kay. Try
> hg clone ssh://
> which is r21498 but with changes to poll one's own FIFO more
> regularly (e.g., even when just performing sends) and to retry
> pending sends more aggressively (e.g., whenever about to try a send
> or whenever one calls sm progress). I maintain a count of
> outstanding fragments (sent but not yet returned to free list) and
> of pending sends (total over all queues) to keep overheads down.
> My various test codes (repeated Bcasts, half-duplex point-to-point
> sends, etc.) all pass now. There is no perceptible degradation in
> 0-byte pingpong latency that I can tell. George's fixed-free-list
> proposal may be better, but I'm making these bits available for some
> soak and feedback.
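
The polling/retry bookkeeping described above might look roughly like the following sketch. Every name and helper here is invented for illustration; none of it is the actual r21498 code.

    #include <stdbool.h>
    #include <stdint.h>

    /* Sketch only: invented names, not Open MPI symbols. */
    static int32_t pending_send_count = 0;      /* sends queued but not yet written, total over all queues */
    static int32_t outstanding_frag_count = 0;  /* fragments sent but not yet returned to the free list    */

    /* Stubs standing in for the real FIFO and queue operations. */
    static void drain_own_fifo(void)        { /* stub: would poll this process's own FIFO */ }
    static bool fifo_write(void *frag)      { (void)frag; return true; /* stub: pretend the write succeeded */ }
    static void enqueue_pending(void *frag) { (void)frag; pending_send_count++; }
    static void retry_pending_sends(void)   { /* stub: would walk the queues and decrement pending_send_count */ }

    /* Called before every send attempt and from sm progress. */
    void sm_poll_and_retry(void)
    {
        drain_own_fifo();                  /* notice returned fragments even when only sending */
        if (pending_send_count > 0)        /* cheap counter check keeps the common path fast   */
            retry_pending_sends();
    }

    bool sm_send(void *frag)
    {
        sm_poll_and_retry();
        if (!fifo_write(frag)) {           /* peer FIFO full: defer the fragment instead of dropping it */
            enqueue_pending(frag);
            return false;
        }
        outstanding_frag_count++;          /* would be decremented when the fragment is returned (not shown) */
        return true;
    }

The counters are what keep the overhead down: the pending-send queues are only walked when the counter says something is actually queued.
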
> Life is still not perfect. If you look in
> mca_btl_sm_component_progress, when a process receives a message
> fragment and returns it to the sender, it executes code like this:
>     goto recheck_peer;
>     break;
> Okay, the reason I show you that code is because a static code
> checker should easily identify the break statement as dead code.
> It'll never be reached. Anyhow, in English, what's happening is if
> you receive a message fragment, you keep polling your FIFO. So,
> consider the case of half-duplex point-to-point traffic: one
> process only sends and the other process only receives. Previously,
> this would eventually hang. Now, it won't. But (I haven't
> confirmed 100% yet), I don't think it executes very pleasantly.
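
The control flow being described, where a received fragment sends the progress loop straight back to polling the same FIFO and the break is left unreachable, can be reconstructed roughly as below. This is a toy illustration, not the real mca_btl_sm_component_progress; the types and helpers are made up.

    #include <stddef.h>

    /* Toy reconstruction: only the goto/break shape mirrors the quoted code. */
    typedef struct frag { int type; } frag_t;
    enum { FRAG_DATA = 1 };

    static frag_t *fifo_read(int peer)           { (void)peer; return NULL; } /* stub: poll one peer's FIFO */
    static void deliver_and_return(frag_t *frag) { (void)frag; }              /* stub: deliver, hand fragment back */

    int toy_sm_progress(int n_peers)
    {
        int nprocessed = 0;
        for (int peer = 0; peer < n_peers; peer++) {
    recheck_peer:
            ;
            frag_t *frag = fifo_read(peer);
            if (frag == NULL)
                continue;                    /* nothing waiting from this peer */
            switch (frag->type) {
            case FRAG_DATA:
                deliver_and_return(frag);
                nprocessed++;
                goto recheck_peer;           /* received a fragment: poll the same FIFO again ... */
                break;                       /* ... so this break is dead code, as noted above    */
            default:
                break;
            }
        }
        return nprocessed;
    }
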
> E.g., if you have
> for ( i = 0; i < N; i++ ) {
>     if ( me == 0 ) MPI_Send(...);
>     if ( me == 1 ) MPI_Recv(...);
> }
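
Filled out into a complete program, that test looks like the following; N, LEN, and the tag are made-up values, since the post elides the MPI_Send/MPI_Recv arguments. Build with mpicc and run with mpirun -np 2.

    #include <mpi.h>
    #include <stdio.h>

    #define N   200000   /* assumed message count  */
    #define LEN 64       /* assumed message length */

    int main(int argc, char **argv)
    {
        int me;
        int buf[LEN] = {0};

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &me);

        /* Half-duplex traffic: rank 0 only sends, rank 1 only receives. */
        for (int i = 0; i < N; i++) {
            if (me == 0) MPI_Send(buf, LEN, MPI_INT, 1, 0, MPI_COMM_WORLD);
            if (me == 1) MPI_Recv(buf, LEN, MPI_INT, 0, 0, MPI_COMM_WORLD,
                                  MPI_STATUS_IGNORE);
        }

        if (me == 1) printf("rank 1 received %d messages\n", N);
        MPI_Finalize();
        return 0;
    }
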
> At some point, the receiver falls hopelessly behind. The sender
> keeps pumping messages and the receiver keeps polling its FIFO,
> pulling in messages and returning fragments to the sender so that
> the sender can keep on going. Problem is, all that is happening
> within one MPI_Recv call... which in a test code might be pulling in
> 100Ks of messages. The MPI_Recv call won't return until the sender
> lets up. Then, the rest of the MPI_Recv calls will execute, all
> pulling messages out of the local unexpected-message queue.
> Not sure yet how I want to manage this. The bottom line might be
> that if the MPI application has no flow control, the underlying MPI
> implementation is going to have to do something that won't make
> everyone happy. Oh well. At least the program makes progress and
> completes in reasonable time.
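
For completeness, one way an application can impose its own flow control, offered here only as an illustration and not as something proposed in this thread, is to acknowledge every so many messages so the sender can never run more than a bounded number of messages ahead of the receiver:

    #include <mpi.h>

    #define N     200000   /* assumed message count        */
    #define LEN   64       /* assumed message length       */
    #define CHUNK 1000     /* assumed flow-control window  */

    int main(int argc, char **argv)
    {
        int me, ack = 0;
        int buf[LEN] = {0};

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &me);

        for (int i = 0; i < N; i++) {
            if (me == 0) MPI_Send(buf, LEN, MPI_INT, 1, 0, MPI_COMM_WORLD);
            if (me == 1) MPI_Recv(buf, LEN, MPI_INT, 0, 0, MPI_COMM_WORLD,
                                  MPI_STATUS_IGNORE);

            if ((i + 1) % CHUNK == 0) {  /* periodic handshake bounds how far ahead the sender runs */
                if (me == 1) MPI_Send(&ack, 1, MPI_INT, 0, 1, MPI_COMM_WORLD);
                if (me == 0) MPI_Recv(&ack, 1, MPI_INT, 1, 1, MPI_COMM_WORLD,
                                      MPI_STATUS_IGNORE);
            }
        }

        MPI_Finalize();
        return 0;
    }
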