Hi Changsheng

Thank you very much for your solution. The program runs well now :)

Regards.

On Tue, Apr 20, 2010 at 3:54 PM, Changsheng Jiang <jiangzuoyan@gmail.com> wrote:
I have encountered the same problem too.

Attaching gdb showed that the processes were stuck in an (e)poll loop. After I set the network interface in ~/.openmpi/mca-params.conf using btl_tcp_if_include, all hosts work fine.
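
For anyone finding this thread later, here is a minimal sketch of that setting. The interface name eth0 is only an assumption for illustration; use whichever interface (as listed by ifconfig or ip addr) actually connects your hosts.

```
# ~/.openmpi/mca-params.conf
# Restrict Open MPI's TCP transport to a single network interface.
# "eth0" is an assumed name; replace it with the interface that links your hosts.
btl_tcp_if_include = eth0
```

As far as I know, this file is read on each host, so it should be present on both machines unless the home directory is shared. The same parameter can also be passed for a single run on the command line, for example: mpirun --mca btl_tcp_if_include eth0 --host host1,host2 -np 8 ProcessColors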

Just FYI.
                                                     Changsheng Jiang


On Tue, Apr 20, 2010 at 14:39, long thai <thaithanhlong2501@gmail.com> wrote:

Hi all.

I have been using Open MPI for only a few days. I am trying to run a simple MPI program, ProcessColors, which I got from CI-Tutor. I have 2 hosts, and if I run the program separately on each one, it runs well. However, if I run it on the two hosts with the following command, the program hangs: mpirun --host host1,host2 --preload-binary -np 8 ProcessColors

When I use ps -A to check the running processes, I see that there are 4 processes running on each host. So I think there is a deadlock in my program, but then why does it run well on a single host?

All of the following commands run without any problem on both machines:

  • mpirun -np 8 ProcessColors
  • mpirun --host host1 -np 8 ProcessColors
  • mpirun --host host2 -np 8 ProcessColors

Later, I found out that the problem occurs when the remote host tries to send a message to the host where the root process (process 0) is running, which is the host on which I run the command. I don't know why the process blocks when sending.

Any help from you is precious to me.

Regards.

Long Thai.



_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

