Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307
From: Adrian Knoth (adi_at_[hidden])
Date: 2008-01-30 03:17:03

On Tue, Jan 29, 2008 at 07:37:42PM -0500, George Bosilca wrote:

> The previous code was correct. Each IP address corresponds to a
> specific endpoint, and therefore to a specific BTL. This enables us to
> have multiple TCP BTLs at the same time, and allows the OB1 PML to
> stripe the data over all of them.
> Unfortunately, your commit disables multi-rail over TCP. Please
> undo it.

That's exactly what I had in mind when I said "this might break

So we need as many endpoints as IP addresses? Then, simply connecting
them leads to oversubscription: two parallel connections on the same
medium. That's where the kernel index enters the scene: we'll have to
make sure not to open two parallel connections to the same remote kernel

I'll revert the patch and come up with another solution, but for the
moment, let me point out that the assumption "One interface, one
address" isn't true. So, the previous code was also wrong.
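To see on a given machine that one interface can carry several addresses, the following is a minimal sketch, assuming a POSIX system that provides getifaddrs() and if_nametoindex() (Linux and the BSDs do). The function name list_addrs_by_kindex is made up for illustration and is not part of Open MPI.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <ifaddrs.h>
#include <net/if.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper, not an Open MPI function.
 * Prints every IPv4/IPv6 address together with the name and kernel index
 * of the interface that carries it. Returns the number of addresses
 * printed, or -1 if getifaddrs() fails. */
int list_addrs_by_kindex(FILE *out)
{
    struct ifaddrs *ifap, *ifa;
    char buf[INET6_ADDRSTRLEN];
    int count = 0;

    assert(out != NULL);
    if (getifaddrs(&ifap) != 0) {
        return -1;
    }
    for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
        const void *raw;

        if (ifa->ifa_addr == NULL) {
            continue;                       /* entry without an address */
        }
        if (ifa->ifa_addr->sa_family == AF_INET) {
            raw = &((struct sockaddr_in *)ifa->ifa_addr)->sin_addr;
        } else if (ifa->ifa_addr->sa_family == AF_INET6) {
            raw = &((struct sockaddr_in6 *)ifa->ifa_addr)->sin6_addr;
        } else {
            continue;                       /* skip non-IP families */
        }
        if (inet_ntop(ifa->ifa_addr->sa_family, raw, buf, sizeof buf) == NULL) {
            continue;
        }
        /* Several lines sharing one kindex mean: one interface, many addresses. */
        fprintf(out, "kindex %u  %-10s %s\n",
                if_nametoindex(ifa->ifa_name), ifa->ifa_name, buf);
        count++;
    }
    freeifaddrs(ifap);
    return count;
}
```

Calling list_addrs_by_kindex(stdout) on a host with both an IPv4 and an IPv6 address on the same NIC should show two lines with the same kernel index.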

I hope not to run into model limitations: avoiding oversubscription
means keeping the number of endpoints per peer no higher than the number
of its interfaces, but accepting incoming connections from this peer
means having all of its addresses (probably more than #remote_NICs)
available in order to accept them.
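The "one connection per remote kernel index" idea can be sketched as follows. This is only an illustration under stated assumptions: struct peer_addr and pick_one_per_kindex are hypothetical names, not Open MPI types or functions, and the sketch assumes each exported address already arrives tagged with the kernel index of the remote interface carrying it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: one exported address of a peer, tagged with the
 * kernel index of the remote interface that carries it. */
struct peer_addr {
    const char *ip;      /* textual address, for illustration only */
    int         kindex;  /* remote kernel interface index */
};

/* Copy into out[] the first address seen for each distinct kernel index.
 * Returns how many addresses were selected, i.e. how many outgoing
 * connections would be opened to this peer. out[] must have room for
 * n entries. The full addrs[] list is left untouched, so it stays
 * available for matching incoming connections. */
size_t pick_one_per_kindex(const struct peer_addr *addrs, size_t n,
                           struct peer_addr *out)
{
    size_t picked = 0;

    assert(addrs != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        int seen = 0;

        for (size_t j = 0; j < picked; j++) {
            if (out[j].kindex == addrs[i].kindex) {
                seen = 1;   /* this interface already has a connection */
                break;
            }
        }
        if (!seen) {
            out[picked++] = addrs[i];
        }
    }
    return picked;
}
```

With three exported addresses where two share kernel index 2 and one has index 3, the function selects two addresses, one per interface, which avoids two parallel connections over the same link.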

As mentioned earlier: it's very common to have multiple addresses per
interface, and it's the kernel that assigns the source address, so
there's nothing one can say in advance about the source of an incoming
connection, only that it could be any of the exported addresses. Any.
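On the accepting side, this means the source address of an incoming connection has to be compared against every address the peer exported, not just the ones we chose to connect to. A minimal sketch, assuming IPv4 only for brevity; struct exported_addr and match_incoming are hypothetical names, not Open MPI code.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

/* Hypothetical record: one address a peer has exported (IPv4 only here). */
struct exported_addr {
    struct in_addr addr;    /* address in network byte order */
    int            kindex;  /* remote kernel interface index */
};

/* Return the position in exported[] whose address equals src, or -1 if
 * src is not one of the peer's exported addresses. Every entry has to be
 * checked, because the remote kernel is free to pick any of them as the
 * source address. */
int match_incoming(const struct exported_addr *exported, size_t n,
                   struct in_addr src)
{
    assert(exported != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (exported[i].addr.s_addr == src.s_addr) {
            return (int)i;
        }
    }
    return -1;
}
```

The returned entry's kindex then tells the receiver which remote interface the connection belongs to, even if the address itself was never used for an outgoing connect.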

Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany