Hi all -
There was a bug in version 1.6.1 that caused singleton spawn not to
work correctly with multi-machine configurations. I verified that a
nightly build of 1.6.2 fixed this issue; in particular, 1.6.2a1r27234
works. I just grabbed the official 1.6.2 release, and it appears that
the fix has somehow been removed.
I am testing with the simple_spawn.c example. Instead of passing
MPI_INFO_NULL to the spawn call, I create an Info object and set the
"add-host" key to a comma-delimited list of nodes. When I run this in the
nightly mentioned above, without mpirun, everything works as expected
(the nodes I list in the "add-host" are in the output text of the
program). When I run the same code with the released 1.6.2, I get the
old behavior from 1.6.1 where all slaves run on localhost.
Thanks,
Brian