On Wed, Apr 29, 2009 at 12:38 PM, Jerry Ye <jerryye_at_[hidden]> wrote:
> I'm currently working in an environment where I cannot use SSH to launch
> child processes. Instead, the process with rank 0 skips the ssh_child
> function in plm_rsh_module.c and the child processes are all started at the
> same time on different machines. Coordination is done with static jobids
> and ports. I have successfully modified the code to get the hello_c example
> working. However, I'm having problems with inter-process communication when
> using MPI_Bcast. Is there something else that I'm obviously missing?
Does your remote invocation method set up the environment variables for
the slave tasks correctly?
I remember MPICH relies on env variables to pass rank and other
information from the rank 0 process to processes with non-zero ranks.
(I have not looked at how things are handled in Open MPI in detail...)
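If you loop through all the environment variables using a " for
(char **e = environ; *e != NULL; e++) printf("%s\n", *e); " loop, and
compare an MPI job started using your remote invocation method vs. the
standard one, then you can find out the answer easily. (Walk a local
pointer rather than incrementing environ itself, otherwise later getenv()
calls in the same process will find nothing.)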
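To make the comparison between the two launch methods easier to diff, the dump can be sorted first. The following is only a minimal sketch: the function name dump_sorted_env and the idea of passing the environment array in explicitly are my own assumptions for illustration, not anything taken from Open MPI or MPICH.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* POSIX global holding the NULL-terminated "NAME=value" array. */
extern char **environ;

/* qsort comparator for an array of C strings. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/*
 * Hypothetical helper: write every entry of envp to out, one per line,
 * in sorted order. Works on a copy of the pointer array, so the caller's
 * environment is left untouched. Returns the number of entries written,
 * or -1 if the temporary array cannot be allocated.
 */
int dump_sorted_env(char **envp, FILE *out)
{
    size_t n = 0;
    while (envp[n] != NULL)
        n++;

    /* Allocate at least one slot so an empty environment still works. */
    char **copy = malloc((n ? n : 1) * sizeof *copy);
    if (copy == NULL)
        return -1;

    memcpy(copy, envp, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_str);

    for (size_t i = 0; i < n; i++)
        fprintf(out, "%s\n", copy[i]);

    free(copy);
    return (int)n;
}
```

Calling dump_sorted_env(environ, stderr) at the top of main() in each process, once under each launch method, and then running diff on the two captured outputs should show which variables the custom launcher fails to set. In Open MPI the interesting ones will most likely carry an OMPI_ prefix, but treat that as a starting point rather than a complete list.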
And if you are using Grid Engine or Torque, then the integration with
Open MPI is already implemented. Maybe you are using Hadoop+something
> - jerry