Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Flow control in OMPI
From: Rodrigo Silva Oliveira (rsilva_at_[hidden])
Date: 2011-09-13 22:53:31


>> Are you overwhelming the receiver with short, unexpected messages such
>> that MPI keeps mallocing and mallocing and mallocing in an attempt to
>> eagerly receive all the messages? I ask because Open MPI only eagerly
>> sends short messages -- long messages are queued up at the sender and not
>> actually transferred until the receiver starts to receive (aka a
>> "rendezvous protocol").

This is probably what is happening. In general, my processes send a massive
number of short messages and it is overwhelming the receivers. Because some
stages of the computation (processes) are much slower than others, the
slower ones cannot handle the incoming messages at the rate they are
delivered to them.
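
To illustrate, the communication pattern is roughly like the sketch below
(heavily simplified; the message size, tags and counts are made up):

/* Heavily simplified illustration: a fast sender blasting short, eagerly
 * sent messages at a slow receiver, so unexpected messages pile up in the
 * receiver's MPI buffers. */
#include <mpi.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char msg[64] = "short message";   /* well below the eager limit */
    long n = 10000000;

    if (rank == 0) {
        /* Sender: each MPI_Send of a short message returns as soon as the
         * data is handed off, so nothing throttles this loop. */
        for (long i = 0; i < n; i++)
            MPI_Send(msg, sizeof(msg), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Receiver: processes each message much more slowly. */
        for (long i = 0; i < n; i++) {
            MPI_Recv(msg, sizeof(msg), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            usleep(100);              /* stand-in for the slow computation */
        }
    }

    MPI_Finalize();
    return 0;
}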

>> Are you sure that you don't have some other kind of memory error in your
>> application?

I have checked, and there are no memory problems within the application.

>> FWIW, you can use MPI_SSEND to do a "synchronous" send, which means that
>> it won't complete until the receiver has started to receive the message.
>> This may slow your sender down dramatically, however. If it slows down
>> your sender too much, you may have to implement your own flow control.

MPI_SSEND worked for my application and the problem went away, but, as you
said, it slows down the senders. A better solution was to implement my own
flow control, as suggested. I implemented a simple credit-based flow control
scheme and it solved my problem.
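
The scheme is roughly the sketch below (simplified, not my exact code; the
credit count, tags and message layout are arbitrary, and it assumes both
sides know how many messages to expect). The receiver grants the sender a
fixed number of credits, the sender spends one credit per message and blocks
when it runs out, and the receiver hands credits back after processing a
batch:

/* Simplified sketch of a credit-based flow control scheme. */
#include <mpi.h>

#define CREDITS    64     /* messages the sender may have "in flight" */
#define DATA_TAG    1
#define CREDIT_TAG  2

/* Sender side: spend one credit per message; when credits run out, block
 * until the receiver returns some. */
void send_with_credits(const char *buf, int len, int dest, long nmsgs)
{
    int credits = CREDITS;
    for (long i = 0; i < nmsgs; i++) {
        if (credits == 0) {
            int returned;
            MPI_Recv(&returned, 1, MPI_INT, dest, CREDIT_TAG,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            credits += returned;
        }
        MPI_Send(buf, len, MPI_CHAR, dest, DATA_TAG, MPI_COMM_WORLD);
        credits--;
    }
}

/* Receiver side: after processing a batch of CREDITS messages, send that
 * many credits back so the sender can continue. */
void recv_with_credits(char *buf, int len, int src, long nmsgs)
{
    int processed = 0;
    for (long i = 0; i < nmsgs; i++) {
        MPI_Recv(buf, len, MPI_CHAR, src, DATA_TAG,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        /* ... process the message (the slow part) ... */
        if (++processed == CREDITS && i + 1 < nmsgs) {
            /* Only return credits if more messages are still coming. */
            MPI_Send(&processed, 1, MPI_INT, src, CREDIT_TAG, MPI_COMM_WORLD);
            processed = 0;
        }
    }
}

With this, the sender can never have more than CREDITS unreceived messages
outstanding, so the receiver's memory use stays bounded.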

Thanks a lot for the explanation and suggestions.

On Tue, Sep 6, 2011 at 9:43 AM, Jeff Squyres <jsquyres_at_[hidden]> wrote:

> Are you overwhelming the receiver with short, unexpected messages such that
> MPI keeps mallocing and mallocing and mallocing in an attempt to eagerly
> receive all the messages? I ask because Open MPI only eagerly sends short
> messages -- long messages are queued up at the sender and not actually
> transferred until the receiver starts to receive (aka a "rendezvous
> protocol").
>
> While that *can* happen, I'd be a little surprised if it did. Indeed, it
> would probably take a little while for that to happen (i.e., the time
> necessary for the receiver to malloc a small amount N times, where N is
> large enough to exhaust the virtual memory on your machine, coupled with all
> the time delay to page out all the old memory and page in on-demand as Open
> MPI scans for new incoming matches... this could be pretty darn slow). Is
> that what is happening?
>
> Are you sure that you don't have some other kind of memory error in your
> application?
>
> FWIW, you can use MPI_SSEND to do a "synchronous" send, which means that it
> won't complete until the receiver has started to receive the message. This
> may slow your sender down dramatically, however. If it slows down your
> sender too much, you may have to implement your own flow control.
>
>
> On Aug 25, 2011, at 10:58 PM, Rodrigo Oliveira wrote:
>
> > Hi there,
> >
> > I am facing some problems in an Open MPI application. Part of the
> > application is composed of a sender and a receiver. The problem is that
> > the sender is much faster than the receiver, which causes the receiver's
> > memory to be completely exhausted, aborting the application.
> >
> > I would like to know if there is a flow control scheme implemented in
> > Open MPI or if this issue has to be handled at the user application
> > layer. If one exists, how does it work and how can I use it in my
> > application?
> >
> > I did some research on this subject, but I did not find a conclusive
> > explanation.
> >
> > Thanks a lot.
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>