Sorry for the delay in replying -- I thought I had replied to this
already, but I guess I hadn't. :-(

We've talked about this feature several times, but this specific
functionality hasn't made it into the OMPI code base yet. Sorry! :-(
(patches would be gladly accepted, but note that we'll likely be kinda
picky about this code because it's a little hairy and complex...)
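
In the meantime, for anyone hitting the error quoted below, here is a rough sketch (Python, not Open MPI code) for confirming that two nodes really do sit on different subnets. On Linux, errno 111 is ECONNREFUSED. The addresses and prefix lengths are made-up placeholders, `same_subnet` is a hypothetical helper name, and this only compares network prefixes; it does not claim to reproduce how the TCP BTL decides which peers are reachable.

```python
import errno
import ipaddress


def same_subnet(a: str, b: str) -> bool:
    """Return True if two "address/prefix" strings fall in the same IP network."""
    return ipaddress.ip_interface(a).network == ipaddress.ip_interface(b).network


if __name__ == "__main__":
    # Hypothetical node addresses; substitute what `ip addr` reports on each host.
    node_a = "192.168.1.10/24"
    node_b = "192.168.2.20/24"
    print("same subnet:", same_subnet(node_a, node_b))
    # Symbolic name for errno 111 on this platform (ECONNREFUSED on Linux).
    print("errno 111:", errno.errorcode.get(111, "unknown"))
```

If the first line reports False for a pair of nodes, you are in the cross-subnet situation described in the original question.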

On Sep 19, 2008, at 7:00 PM, Jeroen Kleijer wrote:
> I'm trying to get an openmpi application running across different
> nodes but seem to have hit a snag when the processes are on different
> nodes, especially when the machines are on different TCP subnets.
> The orted daemons start up fine but after that the application borks with
> the message
> connect() failed with errno=111
> I've read in this thread
> that openmpi currently can't do this yet but (pre-release?) versions
> of openmpi 1.3 will work.
> I've tried compiling openmpi 1.3a (nightly build) and running my
> program with that (compiled with the mpicc of openmpi 1.3a) but I got
> the same error message.
> Can anybody confirm that:
> 1) openmpi has difficulties using the tcp btl across different subnets
> 2) there are currently no workarounds for this.
> If there are solutions to this I'd really like to know about it as
> I've been trying this for quite a while now.
> Jeroen Kleijer