Well, we are aware of this problem, but to be honest I was ready to
bet that nobody would run a cluster of clusters with both Myrinet and
TCP ... so it was on a low-priority TODO list.
The problem is the routing table of the MX device. The MX BTL is
unable to detect that there is not one but multiple distinct Myrinet
networks. As long as a node reports an MX handle back at the end of
MPI_Init, every other node will try to use it whenever it needs to
set up an MX connection. Of course this fails when there are multiple
Myrinet networks. However, this is not supposed to stop your MPI
application: Open MPI will deselect the MX BTL for that particular
connection and switch to TCP (if available).
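As a workaround until the MX BTL handles this, you can force Open MPI
to skip MX entirely for inter-cluster runs by restricting the BTL list
on the command line (the hostnames below are placeholders):

```shell
# Restrict Open MPI to the TCP and self BTLs, bypassing MX altogether.
# "self" is required for a process to send messages to itself.
mpirun --mca btl tcp,self -np 4 -host nodeA,nodeB ./my_mpi_app
```

This avoids the failed mx_connect attempts at startup, at the cost of
also using TCP for intra-cluster traffic.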
On Jun 1, 2007, at 4:25 AM, Christian Kauhaus wrote:
> Kees Verstoep <versto_at_[hidden]>:
>> I am currently experimenting with OpenMPI in a multi-cluster setting
>> where each cluster has its private Myri-10G/MX network besides TCP.
> Very interesting topic. :)
>> I see MX rather than tcp-level connections between clusters being
>> tried, which across clusters fails in mx_connect/mx_isend (at the
>> moment there is no inter-cluster support in MX itself). Besides
>> I do include "tcp" in the network option lists of course.
> It seems that the BTL does not realize that the two Myrinets are not
> connected. We are currently working on getting the handling of nodes
> with different TCP/IP networks right (public IPv4, private IPv4, ...),
> but to my knowledge nobody has done a detailed evaluation of Open
> MPI in
> multi-domain clusters with mixed networks (TCP+MX, TCP+IB, ...) yet.
> Dipl.-Inf. Christian Kauhaus <><
> Lehrstuhl fuer Rechnerarchitektur und -kommunikation
> Institut fuer Informatik * Ernst-Abbe-Platz 1-2 * D-07743 Jena
> Tel: +49 3641 9 46376 * Fax: +49 3641 9 46372 * Raum 3217