Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] initial SCTP BTL commit comments?
From: Brad Penoff (penoff_at_[hidden])
Date: 2007-11-20 13:46:54

On Nov 19, 2007 4:49 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> Are there API functions or data structures that can be used to
> determine if the 1-to-many model is supported on the system?

I don't see how this will be possible because the 1-to-many model is
supported in some way on all systems... I'll try to explain more
clearly below...

> More specifically: can you have your configure.m4 script check to see
> if the current system a) supports SCTP,

Yes, the current configure.m4 does this by making use of OMPI_CHECK_PACKAGE.

> and b) if yes, if it supports 1-to-many? This kind of checking would theoretically
> allow running on Solaris

This is a little more tricky.

(Once again, I'm describing how things were with SCTP in Solaris
based on a Nov 2006 email, so someone please correct me if I'm wrong!)

Solaris has 1-to-many support; it just makes different assumptions.

In general, on all platforms that support SCTP, a one-to-many socket
lets you implicitly establish an "association" (what SCTP calls a
multihomed connection) by just calling sctp_sendmsg and passing the
appropriate sockaddr; no explicit connect() is required.

Use of sctp_sendmsg is supported on a Solaris one-to-many socket for
*only* the initial association establishment; after this first call,
one must query the socket to find out the "association ID" assigned to
that association and then pass it as a parameter to another function,
sctp_send, in order to send data. At least, this is how it was
explained to me; I haven't tried this myself yet and am not
sure if this approach would work on other platforms. How would an
autoconf rule go that far into determining the underlying stack's
assumed one-to-many semantics?
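
To illustrate the limit of a simple check, here is a minimal C sketch of
the kind of probe a configure test could compile and run. It assumes a
system whose <netinet/in.h> defines IPPROTO_SCTP, and the function names
are hypothetical. It only reports whether each socket style can be
created; it cannot tell whether sctp_sendmsg is still accepted once the
association exists, which is the Solaris difference described above.

```c
#include <netinet/in.h>   /* IPPROTO_SCTP */
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if an SCTP socket of the given type can be created,
 * 0 if socket() fails (for example, no SCTP support in the kernel). */
static int sctp_style_available(int sock_type)
{
    int fd = socket(AF_INET, sock_type, IPPROTO_SCTP);
    if (fd < 0) {
        return 0;
    }
    close(fd);
    return 1;
}

/* SOCK_SEQPACKET selects the one-to-many socket style. */
int sctp_one_to_many_available(void)
{
    return sctp_style_available(SOCK_SEQPACKET);
}

/* SOCK_STREAM selects the one-to-one (TCP-like) socket style. */
int sctp_one_to_one_available(void)
{
    return sctp_style_available(SOCK_STREAM);
}
```

Both functions can return 1 on a stack whose one-to-many send semantics
still differ, so a probe like this answers "does the style exist", not
"does it behave the way the BTL expects".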

>, but automatically default to the 1-to-1 mode (if your BTL supports that).

Hmm, I suppose you're right. We could just make Solaris set the MCA
variable btl_sctp_if_11 to 1 in order to use the 1-to-1 mode to avoid
this mess. How would one change the default of an MCA variable in an
autoconf rule? I really hope there's a way to keep one-to-many the
default as often as possible (if not always).
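
One possible approach, sketched in C under stated assumptions: configure
could define a macro (for example with AC_DEFINE) on platforms where
one-to-many is known not to work, and the component could use it as the
fallback when the user has not set the parameter. The macro name
BTL_SCTP_IF_11_DEFAULT and the function below are hypothetical. A real
component would register the parameter through the MCA parameter system
rather than read the environment itself; the sketch only mirrors the
OMPI_MCA_ environment variable convention.

```c
#include <stdlib.h>

/* Hypothetical macro: configure could set this to 1 on platforms that
 * need one-to-one sockets. Without it, one-to-many (0) stays the default. */
#ifndef BTL_SCTP_IF_11_DEFAULT
#define BTL_SCTP_IF_11_DEFAULT 0
#endif

/* Returns 1 for one-to-one sockets, 0 for one-to-many.
 * A user-supplied value always wins over the compiled-in default. */
int btl_sctp_if_11_value(void)
{
    const char *env = getenv("OMPI_MCA_btl_sctp_if_11");
    if (env != NULL && env[0] != '\0') {
        return atoi(env) != 0;
    }
    return BTL_SCTP_IF_11_DEFAULT;
}
```

With this arrangement one-to-many remains the default everywhere the
macro is not defined, and a user on any platform can still override it.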

You can tell that I am not as good at autoconf as the rest of you are!
Bearing that in mind, I actually had another question of my own...

The SCTP API is typically within its own library called libsctp.
However, in FreeBSD 7, the API is within libc. So say we're looking
for something like sctp_recvmsg (as we do now)... what is the best way
to structure an autoconf rule to look for this in either libsctp or
libc, and to not complain if libsctp doesn't exist? Should I just
call OMPI_CHECK_PACKAGE once with libsctp and if that fails then call
OMPI_CHECK_PACKAGE again with libc?

> This also falls in-line with the autoconf mantra: test for the desired
> behavior, not the desired platform (because the list of supported
> platforms may change over time). :-)

Good point. At the moment, yes, the configure.m4 just looks for
particular platforms, namely those that I've had the time to try.
Hopefully in the future, instead of specifying those that work, I can
specify those that don't, but for now, with such a young stack,
it might make more sense to be pessimistic and assume that a given
platform will not work. It will make maintaining the BTL easier as
well ;-)

Just my $.02...


> On Nov 14, 2007, at 1:17 PM, Brad Penoff wrote:
> > On Nov 14, 2007 5:11 AM, Terry Dontje <Terry.Dontje_at_[hidden]> wrote:
> >>
> >> Brad Penoff wrote:
> >>> On Nov 12, 2007 3:26 AM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> >>>
> >>>> I have no objections to bringing this into the trunk, but I agree that
> >>>> an .ompi_ignore is probably a good idea at first.
> >>>>
> >>>
> >>> I'll try to cook up a commit soon then!
> >>>
> >>>
> >>>> One question that I'd like to have answered is how OMPI decides
> >>>> whether to use the SCTP BTL or not. If there are SCTP stacks
> >>>> available by default in Linux and OS X -- but their performance may be
> >>>> sub-optimal and/or buggy, we may want to have the SCTP BTL only
> >>>> activated if the user explicitly asks for it. Open MPI is very
> >>>> concerned with "out of the box" behavior -- we need to ensure that
> >>>> "mpirun a.out" will "just work" on all of our supported platforms.
> >>>>
> >>>
> >>> Just to make a few things explicit...
> >>>
> >>> Things would only work out of the box on FreeBSD, and there the stack
> >>> is very good.
> >>>
> >>> We have less experience with the Linux stack but hope the availability
> >>> of an SCTP BTL will help encourage its use by us and others. Now it
> >>> is a module by default (loaded with "modprobe sctp") but the actual
> >>> SCTP sockets extension API needs to be downloaded and installed
> >>> separately. The so-called lksctp-tools can be obtained here:
> >>>
> >>>
> >>> The OS X stack does not come by default but instead is a kernel
> >>> extension:
> >>>
> >>> I haven't yet started this testing but intend to soon. As of now
> >>> though, the supplied configure.m4 does not try to even build the
> >>> component on Mac OS X.
> >>>
> >>> So in my opinion, things in the configure scripts should be fine the
> >>> way they are since only the FreeBSD stack (which we have confidence in)
> >>> will try to work out of the box; the others require the user to
> >>> install things.
> >>>
> >
> > Greetings,
> >
> >> I am gathering from the text above you haven't tried your BTL on Solaris
> >> at all.
> >
> > The short answer is that you're correct: we haven't tried the Open MPI
> > SCTP BTL yet on Solaris. In fact, the configure.m4 file checks the
> > $host value and only tries to build if it's on Linux or a BSD variant.
> > Mac OS X uses the same code as BSD but I have only just got my hands
> > on a machine so even it hasn't been tested yet; Solaris remains on the
> > TODO list.
> >
> > However, there's a slightly longer answer...
> >
> > After a series of emails with the Sun SCTP people
> > (sctp-questions_at_[hidden] but mostly Kacheong Poon) a year ago, I
> > learned SCTP support is within Solaris 10 by default. In general,
> > SCTP supports its own socket API, in addition to the standard Berkeley
> > sockets API; the SCTP-specific sockets API unlocks some of SCTP's
> > newer features (e.g., multistreaming). We make use of this
> > SCTP-specific sockets API.
> >
> > The Solaris stack (as of a year ago) made certain assumptions about
> > the SCTP-specific sockets API. I'm just looking back on those emails
> > now to refresh my memory... it looks like the Solaris stack, as of
> > Nov 2006, did not allow the use of one-to-many sockets (the current
> > default in our BTL) together with the sctp_sendmsg call. They
> > mentioned an alternative, but we didn't have the time to explore it.
> > I'm not sure if this has changed on the Solaris stack within the past
> > year... I never got the time to revisit this.
> >
> > In the past, we had mostly used the one-to-many socket (with our LAM
> > and MPICH2 versions). One unique thing about this Open MPI SCTP BTL
> > is that there is also a choice to make use of (the more TCP-like)
> > one-to-one socket style. The socket style used by the SCTP BTL is
> > adjustable with the MCA parameter btl_sctp_if_11 (if set to 1, it uses
> > 1-1 sockets; by default it is 0 and uses 1-many). I've never used
> > one-to-one sockets on the Solaris stack, but it may have a better
> > chance of working (also one-to-many may work now; I haven't kept
> > up-to-date).
> >
> > We also noticed that on Solaris we had to do some things a little
> > differently with iovecs because the struct msghdr (used by sendmsg) had
> > no msg_control field; to get around this, we had to pack the iovec
> > contents into a buffer and send that buffer instead of using the iovec
> > directly.
> >
> > Anyway, hope this fully answers your questions. In general, it'd be
> > nice if we have the time/assistance to add in Solaris support
> > eventually.
> >
> > brad
> >
> >>
> >> --td
> >>
> >> _______________________________________________
> >> devel mailing list
> >> devel_at_[hidden]
> >>
> >>
> >>
> --
> Jeff Squyres
> Cisco Systems
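
The iovec workaround mentioned in the quoted Nov 14 message (packing the
iovec contents into one buffer and sending that buffer) can be sketched
in C as follows. This is an illustration only, not the BTL's actual code;
the function name pack_iovec is hypothetical, and the send call itself is
left out.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec */

/* Copies every iovec segment, in order, into one newly allocated buffer.
 * On success returns the buffer (the caller frees it) and stores the
 * total length in *out_len. Returns NULL if allocation fails. */
char *pack_iovec(const struct iovec *iov, int iovcnt, size_t *out_len)
{
    size_t total = 0;
    for (int i = 0; i < iovcnt; i++) {
        total += iov[i].iov_len;
    }

    /* Allocate at least one byte so an empty vector is not mistaken
     * for an allocation failure. */
    char *buf = malloc(total > 0 ? total : 1);
    if (buf == NULL) {
        return NULL;
    }

    size_t offset = 0;
    for (int i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len > 0) {
            memcpy(buf + offset, iov[i].iov_base, iov[i].iov_len);
            offset += iov[i].iov_len;
        }
    }

    *out_len = total;
    return buf;
}
```

The cost of this approach is one extra copy per send, which is the
trade-off for not handing the iovec to the stack directly.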