On Apr 29, 2009, at 1:38 PM, Jerry Ye wrote:
> I'm currently working in an environment where I cannot use SSH to
> launch child processes. Instead, the process with rank 0 skips the
> ssh_child function in plm_rsh_module.c and the child processes are
> all started at the same time on different machines. Coordination is
> done with static jobids and ports. I have successfully modified the
> code to get the hello_c example working.
Excellent. What mechanism are you using to start your jobs? Would it
be easier to fork the rsh plm into your own plugin? Is this code you
can share with the community?
> However, I'm having problems with inter-process communication when
> using MPI_Bcast. Is there something else that I'm obviously missing?
The PLM just starts up jobs -- other plugins are used for MPI
communications. E.g., the TCP BTL is probably what you're using for
MPI communications. Is that where it's failing?
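
A rough way to narrow this down, offered as a sketch and not a confirmed
fix: restrict Open MPI to the TCP and self BTLs and turn up BTL verbosity.
Because your processes are not started through mpirun, this assumes the MCA
parameters are passed as OMPI_MCA_<name> environment variables set before
each process starts. The verbosity level of 30 is an arbitrary choice.

```shell
# Assumption: exported in the environment of every process before it starts.
# Use only the TCP BTL (plus "self" for a process sending to itself).
export OMPI_MCA_btl=tcp,self
# Ask the BTL framework to log what it is doing; 30 is an arbitrary level.
export OMPI_MCA_btl_base_verbose=30
```

If MPI_Bcast still fails with these set, the extra output may help show
whether the TCP BTL is failing to connect between the machines.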