I am having trouble communicating through the sm BTL in Open MPI. Please
check the attached file for my system information. I am using Open
MPI 1.4.3 with Intel compilers v11.1 on Linux RHEL 5.4 (kernel 2.6).
I ran the following two tests:
(1) If I pass the BTL list to mpirun as "--mca btl self,sm,openib" and
no computing node appears more than once in the node list, my job
runs fine. However, if any computing node appears twice or more in
the node list, the job hangs forever.
(2) If I leave out the sm BTL and pass "--mca btl self,openib" to
mpirun, my job runs smoothly whether or not any computing node
appears twice or more in the node list.
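
To make the two tests concrete, here is a minimal sketch that builds the four mpirun command lines involved. The host names (node01, node02), the binary name (./mpi_test), and the helper mpirun_command are hypothetical and only for illustration. Passing the node list through "-host" is also an assumption; a hostfile would serve the same purpose.

```python
def mpirun_command(btls, hosts, program, mpirun="mpirun"):
    """Build an mpirun argument list for a BTL list and a node list.

    One process is requested per entry in `hosts`, so a node that is
    listed twice gets two processes.
    """
    return [
        mpirun,
        "--mca", "btl", ",".join(btls),
        "-np", str(len(hosts)),
        "-host", ",".join(hosts),
        program,
    ]


# Hypothetical node names and test binary (not from the original report).
unique_hosts = ["node01", "node02"]
repeated_hosts = ["node01", "node01", "node02"]
program = "./mpi_test"

with_sm = ["self", "sm", "openib"]   # test (1)
without_sm = ["self", "openib"]      # test (2)

cases = [
    ("test 1, each node once (runs)", with_sm, unique_hosts),
    ("test 1, node repeated (hangs)", with_sm, repeated_hosts),
    ("test 2, each node once (runs)", without_sm, unique_hosts),
    ("test 2, node repeated (runs)", without_sm, repeated_hosts),
]

for label, btls, hosts in cases:
    print(label + ": " + " ".join(mpirun_command(btls, hosts, program)))
```

Only the second case, where sm is enabled and two processes share a node, reproduces the hang.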
From the two tests above, something is apparently wrong with the sm
BTL on my system. When I checked the user archive, I found that an
sm BTL issue has been reported in connection with comm_spawned
parent/child processes. But that does not seem to be the case here:
even if I do not run any of my MPI-based solvers and only call the
MPI initialization and finalization procedures, the issue still occurs.
---- File information -----------
Date: 9 Feb 2012, 8:58
Size: 126316 bytes.