> -----Original Message-----
> From: users-bounces_at_[hidden]
> [mailto:users-bounces_at_[hidden]] On Behalf Of Keith Refson
> Sent: Tuesday, July 18, 2006 6:21 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Openmpi, LSF and GM
> > > The arguments you want would look like:
> > >
> > > mpirun -np X -mca btl gm,sm,self -mca btl_base_verbose 1 -mca
> > > btl_gm_debug 1 <other arguments>
> Aha. I think I had misunderstood the syntax slightly, which
> explains why
> I previously saw no debugging information. I had also
> omitted the "sm"
> btl - though I'm not sure what that one is....
"sm" = "shared memory". It's used for on-node communication between
> I am now getting some debugging output
> [scarf-cn008.rl.ac.uk:04291] [0,1,0] gm_port 017746B0, board
> 545460846592, global 3712550725 node
> which I hope means that I am using the GM btl. The run is
> also about 20% quicker than
> before which may suggest that I was not previously using gm.
It does. Excellent!
> I have also noticed that if I simply specify --mca btl ^tcp +
> the debugging options
> the run works, apparently uses gm, and is just as quick. It was
> (and is) the combination
> -mca btl gm,sm,self,^tcp
> that fails with
> No available btl components were found!
The syntax only allows you to do the "^" notation if that's *all* you
use. Check out this FAQ entry (I just expanded its text a bit):
More specifically, you cannot mix the "^" and non-"^" notation -- it
doesn't make sense. Here's why -- if you list:
--mca btl a,b,c
This tells Open MPI to *only* use components a, b, and c. Using the
exclusive behavior, thus:
--mca btl ^d
means "use all components *except* d". Hence, doing this:
--mca btl a,b,c,^d
would presumably mean "only use a, b, and c" and "use all components
*except* d", which doesn't make sense. Taking a looser definition of
the inclusive and exclusive behavior, you could interpret it to mean
"use only a, b, and c, and *not* use d" -- but that would be redundant
because it's already *not* going to use d because it's *only* using a,
b, and c.
Hence, the inclusive and exclusive notations are mutually exclusive.
Indeed, the ^ character is only recognized as a prefix for the whole
value for this exact reason. This is why you got the error that you did
-- when you used "gm,sm,self,^tcp", it was looking for a component named
"^tcp" since the "^" was not recognized as the exclusion character (and
therefore didn't find it).
I'll add some detection code so that we emit a warning if we find "^" in
the string and it's not the first character.
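To make the include/exclude rule concrete, here is a minimal Python sketch of the selection semantics described above, plus the kind of warning mentioned for a stray "^". It is an illustration only and not Open MPI's actual code: the function name select_components and the list of available components are hypothetical, and the sketch does not reproduce the "No available btl components were found!" failure path.

```python
import warnings


def select_components(value, available):
    """Pick components from `available` according to an '--mca btl'-style value.

    Hypothetical helper for illustration; not part of Open MPI.
    """
    # "^" is only meaningful as a prefix for the whole value.
    exclusive = value.startswith("^")
    body = value[1:] if exclusive else value
    names = [name.strip() for name in body.split(",") if name.strip()]

    # A "^" anywhere else is just part of a (nonexistent) component name.
    for name in names:
        if "^" in name:
            warnings.warn(
                f"'^' is only recognized at the start of the value; "
                f"{name!r} is treated as a literal component name"
            )

    if exclusive:
        # Exclusive form: everything except the listed components.
        return [c for c in available if c not in names]
    # Inclusive form: only the listed components.
    return [c for c in available if c in names]
```

With a hypothetical available list of gm, sm, self, and tcp, both "gm,sm,self" and "^tcp" select gm, sm, and self. The mixed form "gm,sm,self,^tcp" triggers the warning, because "^tcp" is read as a component name rather than as an exclusion.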
> > > LSF. I believe it is on our feature request list, but I
> also don't
> > > believe we have a timeline for implementation.
> OK. It is actually quite easy to construct a hostfile from the LSF
> environment and start the processes using the openmpi mpirun command.
> I don't know how this will interact with larger scale usage,
> job termination etc but I plan to experiment.
If you use the LSF drop-in replacement for rsh (lsgrun), you should be
ok because it will use LSF's native job-launching mechanisms behind the
scenes (and therefore can use LSF's native job-termination mechanisms
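The hostfile approach Keith describes could be sketched roughly as follows. This assumes the job runs under LSF with the LSB_MCPU_HOSTS environment variable set to alternating host names and slot counts, and that the output is passed to mpirun via --hostfile. The function name lsf_hostfile_lines and the example host names are hypothetical; this is not the script Keith actually used.

```python
import os


def lsf_hostfile_lines(mcpu_hosts):
    """Convert an LSB_MCPU_HOSTS string ("hostA 2 hostB 4") to hostfile lines.

    Hypothetical helper for illustration only.
    """
    fields = mcpu_hosts.split()
    if len(fields) % 2 != 0:
        raise ValueError("expected alternating host names and slot counts")
    # Pair each host name with the slot count that follows it.
    return [
        f"{host} slots={int(count)}"
        for host, count in zip(fields[0::2], fields[1::2])
    ]


if __name__ == "__main__":
    # Print the hostfile to stdout; redirect it to a file inside the LSF job.
    for line in lsf_hostfile_lines(os.environ.get("LSB_MCPU_HOSTS", "")):
        print(line)
```

For example, the input "nodeA 2 nodeB 4" yields the lines "nodeA slots=2" and "nodeB slots=4". Job termination and cleanup under LSF are not handled by this sketch.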
> One further question. My run times are still noticeably longer than
> with mpich_gm. I saw in the mailing list archives that there was
> a new implementation of the collective routines in 1.0,
> (which my application
> depends on rather heavily). Is this the default in openmpi 1.1 or is
The new collectives were introduced in 1.1, not 1.0, and yes, they are
> it still necessary to specify this manually? And if anyone
> has a comparison
> of MPI_AlltoallV performance with other MPI implementations
> I'd like to
> hear the numbers.
There is still work to be done in the collectives, however -- no
optimized "vector" algorithms (e.g., MPI_Alltoallv) have been introduced
yet.
> Thanks again for all the work. Openmpi looks very promising and it is
> definitely the easiest to install and get running of any MPI
> I have tried so far.
Glad to hear it -- thanks for the feedback!
Server Virtualization Business Unit