Open MPI User's Mailing List Archives

From: Chris Reeves (chris.reeves_at_[hidden])
Date: 2007-06-21 13:09:06


Thanks for all your replies and sorry for the delay in getting back to you.

On Tue, Jun 19, 2007 at 01:40:21PM -0400, Jeff Squyres wrote:
> On Jun 19, 2007, at 9:18 AM, Chris Reeves wrote:
>
> > Also attached is a small patch that I wrote to work around some firewall
> > limitations on the nodes (I don't know if there's a better way to do this
> > - suggestions are welcome). The patch may or may not be relevant, but I'm
> > not ruling out network issues and a bit of peer review never goes amiss
> > in case I've done something very silly.
>
> From the looks of the patch, it looks like you just want Open MPI to
> restrict itself to a specific range of ports, right? If that's the
> case, we'd probably do this slightly differently (with MCA parameters
> -- we certainly wouldn't want to force everyone to use a hard-coded
> port range). Brian's also re-working some TCP and OOB issues on a /
> tmp branch right now; we'd want to wait until he's done before
> applying a similar patch.

I thought that would be the 'official', configurable way to do it. But I lack
a thorough enough understanding of how everything fits together to implement
it in that way.
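For the archives, I gather the MCA-parameter version would look something like the following. The parameter names are taken from newer Open MPI releases than the one I'm running, so treat this as a guess rather than something I've tested; `ompi_info --param btl tcp` should show what a given installation actually supports:

```shell
# Sketch only: restrict the TCP BTL to a fixed 1000-port window
# starting at port 10000 ("./my_app" is a placeholder binary name).
# Parameter names are from later Open MPI releases -- verify with
# `ompi_info --param btl tcp` before relying on them.
mpirun --mca btl_tcp_port_min_v4 10000 \
       --mca btl_tcp_port_range_v4 1000 \
       -np 10 ./my_app
```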

> My first question is: why are you calling MPI_BARRIER? ;-)

Good question. Thinking about it, probably not all of the occurrences are
necessary. I didn't write this code, but I will discuss it with my colleague.

> Clearly, if we're getting stuck in there, it could be a bug. Have
> you run your code through a memory-checking debugger? It's hard to
> say exactly what the problem is without more information -- it could
> be your app, it could be OMPI, it could be the network, ...
>
> It's a good datapoint to run with other MPI implementations, but "it
> worked with MPI X" isn't always an iron-clad indication that the new
> MPI is at fault. I'm not saying we don't have bugs in Open MPI :-)
> -- I'm just saying that I agree with you: more data is necessary.

The code is compiled with debugging turned on (gcc's -g flag), though I realise
that only adds debugging symbols rather than doing any actual memory checking
at run time.
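For real memory checking I'd presumably need to run each rank under something like Valgrind. A sketch of what I have in mind ("./my_app" is a placeholder for the real binary, and I haven't tried this combination myself):

```shell
# Run every MPI rank under Valgrind's memcheck tool; %p in the log
# file name expands to each process's PID so the ranks don't clobber
# each other's output. Expect a large slowdown.
mpirun -np 10 valgrind --leak-check=full --log-file=vg.%p ./my_app
```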

Indeed. I'm not necessarily blaming Open MPI :-p The above was merely, as you
say, an additional datapoint.

> > (gdb) where
> > #0 0x9000121c in sigprocmask ()
> > #1 0x01c46f96 in opal_evsignal_recalc ()
> > #2 0x01c458c2 in opal_event_base_loop ()
> > #3 0x01c45d32 in opal_event_loop ()
> > #4 0x01c3e6f2 in opal_progress ()
> > #5 0x01b6083e in ompi_request_wait_all ()
> > #6 0x01ec68d8 in ompi_coll_tuned_sendrecv_actual ()
> > #7 0x01ecbf64 in ompi_coll_tuned_barrier_intra_bruck ()
> > #8 0x01b75590 in MPI_Barrier ()
>
> Just a quick sanity check: I assume the call stack is the same on all
> processes, right? I.e., ompi_coll_tuned_barrier_intra_bruck () is
> the call right after MPI_BARRIER?

It is similar on all of them. Naturally, different processes are at different
points in the loop when I attach, but the traces are close enough: all of them
show ompi_coll_tuned_barrier_intra_bruck in the stack right after MPI_Barrier.
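For what it's worth, this is roughly how I collect the traces on each node; batch mode dumps the backtrace and detaches without holding the process for long ("my_app" stands in for the real binary name, and finding PIDs via pgrep is just how these nodes happen to be set up):

```shell
# Dump a backtrace from every running copy of the application on this
# node without an interactive gdb session.
for pid in $(pgrep my_app); do
    gdb -batch -ex "thread apply all bt" -p "$pid"
done
```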

> > What if some packets went missing on the network? Surely TCP should take
> > care of this and resend?
>
> What is the topology of the network that you're running on?

9 machines are physically co-located and each have a single connection to one
of two linked switches. The 10th machine is in a different part of the
building, but on the same subnet (off a different switch). All machines can
talk to each other under normal conditions.

> > As implied by my line of questioning, my current thoughts are that some
> > messages between nodes have somehow gone missing. Could this happen? What
> > could cause this? All machines are on the same subnet.
>
> Hmm. On a single subnet, but you need the firewall capability -- are
> they physically remote from each other, or do you just have the local
> firewalling capabilities enabled on each node?

Each node has a local firewall set up by the systems administrator, who was
persuaded to poke a 'small' (1000-port) hole in said firewall for
communication between the nodes. There are no further firewalls between the
nodes. The firewalls are there to stay.
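For reference, the hole amounts to a single rule per node along these lines. The subnet and port range below are illustrative values, not the actual ones in use here:

```shell
# Allow inbound TCP from the cluster subnet on a 1000-port window.
# 10.0.0.0/24 and 10000:10999 are made-up example values.
iptables -A INPUT -p tcp -s 10.0.0.0/24 --dport 10000:10999 -j ACCEPT
```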

Cheers for all your help so far,
    Chris