This case actually works. We ran into it a few days ago, when we
discovered that one of the compute nodes in a cluster didn't get its
Myrinet card installed properly ... The performance was horrible but
the application ran to completion.
You will have to use the following flags: --mca pml ob1 --mca btl
mx,tcp,self
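As a rough sketch, a launch line with these flags might be assembled as
below. The process count, hostfile name and application path are
placeholders chosen for illustration and are not from the original
report; only the two --mca settings come from this thread.

```shell
# Placeholder values (assumptions, adjust for your cluster)
NP=16                 # number of MPI processes
HOSTFILE=hosts.txt    # hostfile listing both Myrinet and gigE-only nodes
APP=./my_app          # hypothetical MPI application

# Flags from this thread: select the ob1 PML and allow the mx, tcp and self BTLs
MCA_FLAGS="--mca pml ob1 --mca btl mx,tcp,self"

# Build the command and print it; run the printed line to actually launch
CMD="mpirun $MCA_FLAGS -np $NP --hostfile $HOSTFILE $APP"
echo "$CMD"
```

The command is only printed here so it can be checked before use on a
real mixed cluster.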
george.
On Jan 15, 2008, at 8:49 AM, M Jones wrote:
> Hi,
>
> We have a mixed environment in which roughly 2/3 of the nodes
> in our cluster have myrinet (mx 1.2.1), while the full cluster has
> gigE. Running open-mpi exclusively on myrinet nodes or exclusively
> on non-myrinet nodes is fine, but mixing the two node types
> results in a runtime error (PML add procs failed), no matter what
> --mca flags I try to use to push the traffic onto tcp (note that
> --mca mtl ^mx --mca btl ^mx does appear to use tcp, as long as all
> of the nodes have myrinet cards, but not in the mixed case).
>
> I thought that we would be able to use a single open-mpi build to
> support both networks (and users would be able to request mx nodes if
> they need them using the batch queuing system, which they are
> already accustomed to). Am I missing something (or just doing
> something dumb)? Compiling mpi implementations for each compiler suite
> is bad enough, add in separate builds for networks and it just gets
> worse ...
>
> Matt
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users