Subject: [MTT users] test example connectivity_c cannot run on multiple nodes
From: 高星 (xinggao_at_[hidden])
Date: 2012-11-26 02:39:25


Dear mtt-users,

I recently installed Open MPI 1.6.3 under my home directory. After installation, I ran the test examples under ~openmpi-1.6.3/examples. hello_c/cxx/f77/f90 run without problems, but ring_c/cxx/f77/f90 and connectivity_c fail when I use more than one node.

I chose two nodes, node20 and node30, each with 8 cores. After logging in to each node, I ran "mpirun -np 2 connectivity_c" and got the message "connectivity test on 2 processes passed", so there is no problem there. I then tried a non-interactive test by running "ssh node20/node30 mpirun -np 2 connectivity_c" from the server, and that passed as well. I also tried a machinefile, "mpirun -np 2 --hostfile myhostfile.txt connectivity_c", with myhostfile.txt containing either only node20 with slots=2 or only node30 with slots=2; the test passed in both cases.

However, if I include both node20 and node30 in myhostfile.txt, the program fails: it hangs. When I run ring_c the same way, it hangs at the output line "Process 0 decremented value: 9".
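
For illustration, the failing two-node setup would look roughly like the sketch below. The slots=1 values are an assumption, since the slot counts used when both nodes are listed are not stated above; the file name and command are the ones quoted in the message.

```
# myhostfile.txt (sketch; slots=1 per node is an assumed value)
node20 slots=1
node30 slots=1

# command run from the server, as quoted above
mpirun -np 2 --hostfile myhostfile.txt connectivity_c
```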

Has anybody run into this kind of problem before, and what should I check to solve it? Please find the ompi_info output attached.

Thanks in advance!

Xing