
Open MPI User's Mailing List Archives


From: Jean-Christophe Hugly (jice_at_[hidden])
Date: 2006-02-07 19:03:03


On Thu, 2006-02-02 at 21:49 -0700, Galen M. Shipman wrote:
>
> I suspect the problem may be in the bcast,
> ompi_coll_tuned_bcast_intra_basic_linear. Can you try the same run using
>
> mpirun -prefix /opt/ompi -wdir `pwd` -machinefile /root/machines -np
> 2 -mca coll self,basic -d xterm -e gdb PMB-MPI1
>
>
> This will use the basic collectives and may isolate the problem.

Hi Galen,

After much fiddling around (running with verbose, trace, etc.), I found
one way of making it work, which might explain why it normally works for
you and not for me: I have two active ports on each of my nodes, not one.

After disconnecting port1 on each node, open-mpi works.

What led me to try that was the trace I got while running the trivial
osu_lat test:
bench1:~ # mpirun -prefix /opt/ompi -wdir `pwd` -mca btl_base_debug 2 -mca btl_base_verbose 10 -mca coll basic -machinefile /root/machines -np 2 osu_lat
[0,1,0][btl_openib.c:150:mca_btl_openib_del_procs] TODO

[0,1,0][btl_openib.c:150:mca_btl_openib_del_procs] TODO

[0,1,1][btl_openib.c:150:mca_btl_openib_del_procs] TODO

[0,1,1][btl_openib.c:150:mca_btl_openib_del_procs] TODO

# OSU MPI Latency Test (Version 2.1)
# Size Latency (us)
[0,1,1][btl_openib_endpoint.c:756:mca_btl_openib_endpoint_send] Connection to endpoint closed ... connecting ...
[0,1,1][btl_openib_endpoint.c:394:mca_btl_openib_endpoint_start_connect] Initialized High Priority QP num = 263174, Low Priority QP num = 263175, LID = 5
[0,1,1][btl_openib_endpoint.c:317:mca_btl_openib_endpoint_send_connect_data] Sending High Priority QP num = 263174, Low Priority QP num = 263175, LID = 5
[0,1,0][btl_openib_endpoint.c:594:mca_btl_openib_endpoint_recv] Received High Priority QP num = 263174, Low Priority QP num 263175, LID = 5
[0,1,0][btl_openib_endpoint.c:450:mca_btl_openib_endpoint_reply_start_connect] Initialized High Priority QP num = 4719622, Low Priority QP num = 4719623, LID = 3
[0,1,0][btl_openib_endpoint.c:339:mca_btl_openib_endpoint_set_remote_info] Setting High Priority QP num = 263174, Low Priority QP num 263175, LID = 5
[0,1,0][btl_openib_endpoint.c:317:mca_btl_openib_endpoint_send_connect_data] Sending High Priority QP num = 4719622, Low Priority QP num = 4719623, LID = 3
[0,1,1][btl_openib_endpoint.c:594:mca_btl_openib_endpoint_recv] Received High Priority QP num = 4719622, Low Priority QP num 4719623, LID = 3
[0,1,1][btl_openib_endpoint.c:339:mca_btl_openib_endpoint_set_remote_info] Setting High Priority QP num = 4719622, Low Priority QP num 4719623, LID = 3
[0,1,1][btl_openib_endpoint.c:317:mca_btl_openib_endpoint_send_connect_data] Sending High Priority QP num = 263174, Low Priority QP num = 263175, LID = 5
[0,1,0][btl_openib_endpoint.c:594:mca_btl_openib_endpoint_recv] Received High Priority QP num = 263174, Low Priority QP num 263175, LID = 5
[0,1,0][btl_openib_endpoint.c:317:mca_btl_openib_endpoint_send_connect_data] Sending High Priority QP num = 4719622, Low Priority QP num = 4719623, LID = 3
[0,1,1][btl_openib_endpoint.c:594:mca_btl_openib_endpoint_recv] Received High Priority QP num = 4719622, Low Priority QP num 4719623, LID = 3
[0,1,0][btl_openib_endpoint.c:773:mca_btl_openib_endpoint_send] Send to : 1, len : 32768, frag : 0xdb7080
[0,1,0][btl_openib_endpoint.c:756:mca_btl_openib_endpoint_send] Connection to endpoint closed ... connecting ...
[0,1,0][btl_openib_endpoint.c:394:mca_btl_openib_endpoint_start_connect] Initialized High Priority QP num = 4719624, Low Priority QP num = 4719625, LID = 4
[0,1,0][btl_openib_endpoint.c:317:mca_btl_openib_endpoint_send_connect_data] Sending High Priority QP num = 4719624, Low Priority QP num = 4719625, LID = 4
[0,1,1][btl_openib_endpoint.c:594:mca_btl_openib_endpoint_recv] Received High Priority QP num = 4719624, Low Priority QP num 4719625, LID = 4

...Then nothing else happens.

You'll notice the appearance of LID = 4 towards the end.
For context: port 1 of node 0 has LID 3, port 2 of node 0 has LID 4,
port 1 of node 1 has LID 5, and port 2 of node 1 has LID 6.

In case it is useful to you, the topology of the fabric is as follows:
there are two IB switches; one switch connects to port 1 of all nodes,
and the other connects to port 2 of all nodes. Two nodes are used to run
MPI apps. The third is where opensm and other stuff runs. The two
switches are normally cross-connected many times over. I tried the same
experiment both with the cross-connection and with the two planes
segregated. In the latter case, I ran a second opensm bound to the
second plane.
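For reference, the per-port LID assignment described above can be
double-checked on each node with the standard libibverbs and
infiniband-diags utilities (a quick sketch; exact output formatting
varies by driver version):

```shell
# Show every HCA port on this node together with its state and LID
# (ibv_devinfo ships with libibverbs).
ibv_devinfo | grep -E 'port:|state|port_lid'

# ibstat, from infiniband-diags, reports the same per-port LIDs.
ibstat
```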

The test ran to completion only after I suppressed the second plane
completely by disconnecting the second switch. In that case the tuned
collectives work just as well, btw.

The ability to run with two ports active is very important to us. Not
only are we very much interested in ompi's multi-rail feature, but we
also use IB for things other than MPI and spread the load over the two
ports.

Is there a special way of configuring ompi for it to work properly with
multiple ports?
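In the meantime, a possible software-side workaround I'm considering
(a sketch only, not verified against this build; whether a port-selection
parameter exists, and its name, varies by Open MPI release) is to pin the
openib BTL to a single port instead of physically unplugging the cable:

```shell
# First, list the openib BTL parameters this particular build supports.
ompi_info --param btl openib

# If the build exposes an interface/port selection parameter (e.g.
# btl_openib_if_include in later releases -- an assumption here, as is
# the "mthca0:1" device:port spelling), MPI traffic could be restricted
# to port 1 only:
mpirun -prefix /opt/ompi -wdir `pwd` -machinefile /root/machines -np 2 \
       -mca btl openib,self \
       -mca btl_openib_if_include mthca0:1 \
       osu_lat
```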

-- 
Jean-Christophe Hugly <jice_at_[hidden]>
PANTA