Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-02-10 15:50:34


Can you provide a specific example?

I'm able to do this just fine, for example (with the upcoming OMPI 1.4.5):

mpirun --host svbu-mpi001,svbu-mpi001,svbu-mpi002,svbu-mpi002 --mca btl sm,openib,self ring
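
To compare the two cases from the quoted report side by side, a small helper along these lines could generate both command lines. This is only a sketch: the function name `build_mpirun_cmd` and its parameters are hypothetical, the host names and the `ring` program are taken from the command above, and only the `--host` and `--mca btl` options already shown in this thread are used.

```python
import shlex


def build_mpirun_cmd(hosts, copies_per_host, btls, program):
    """Build an mpirun argument list that lists each host copies_per_host times.

    Hypothetical helper for illustration; it only assembles the command line
    and does not run anything.
    """
    if copies_per_host < 1:
        raise ValueError("copies_per_host must be at least 1")
    # Repeat each host name so it appears copies_per_host times in --host.
    host_list = [h for h in hosts for _ in range(copies_per_host)]
    return [
        "mpirun",
        "--host", ",".join(host_list),
        "--mca", "btl", ",".join(btls),
        program,
    ]


if __name__ == "__main__":
    hosts = ["svbu-mpi001", "svbu-mpi002"]
    # Case 1: sm enabled, each host listed twice.
    with_sm = build_mpirun_cmd(hosts, 2, ["self", "sm", "openib"], "ring")
    # Case 2: sm left out, same host list.
    without_sm = build_mpirun_cmd(hosts, 2, ["self", "openib"], "ring")
    print(shlex.join(with_sm))
    print(shlex.join(without_sm))
```

Running the two printed commands with the same host list isolates the sm btl as the only difference between the runs.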

On Feb 9, 2012, at 9:31 AM, yanyg_at_[hidden] wrote:

> Hi all,
>
> Good morning!
>
> I am having trouble communicating through the sm btl in Open MPI; please
> check the attached file for my system information. I am using Open
> MPI 1.4.3, Intel compilers V11.1, on Linux RHEL 5.4 with kernel 2.6.
>
> The tests are the following:
>
> (1) If I pass "--mca btl self,sm,openib" to mpirun and do not list
> any of my computing nodes more than once in the node list, my job
> runs fine. However, if I list any of the computing nodes twice or
> more in the node list, the job hangs forever.
>
> (2) If I leave out the sm btl and pass "--mca btl self,openib" to
> mpirun, my job runs smoothly whether or not any of the computing
> nodes appears twice or more in the node list.
>
> From the above 2 tests, apparently something is wrong with the sm btl
> interface on my system. When I checked the user archive, an sm btl
> issue had been encountered due to comm_spawned parent/child
> processes. But that does not seem to be the case here: even if I do
> not use my MPI-based solver and call only the MPI initialization and
> finalization procedures, the issue still occurs.
>
> Any comments?
>
> Thanks,
> Yiguang
>
> ---- File information -----------
> File: ompiinfo-config-uname-output.tgz
> Date: 9 Feb 2012, 8:58
> Size: 126316 bytes.
> Type: Unknown
> <ompiinfo-config-uname-output.tgz>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/