Hi,
You should probably use -mca btl tcp,self -mca btl_openib_if_include ib0.8109
Lenny.
Hi,
I'm trying to get Open MPI working over InfiniBand partitions with the openib BTL. On this cluster, the partition number is 0x109. The IB hosts can ping each other over the corresponding ib0.8109 interface:
d2:/opt/openmpi-ib # ifconfig ib0.8109
ib0.8109  Link encap:UNSPEC  HWaddr 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
          inet addr:10.21.48.2  Bcast:10.21.255.255  Mask:255.255.0.0
          inet6 addr: fe80::202:c902:26:ca01/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
          RX packets:16811 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15848 errors:0 dropped:1 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:102229428 (97.4 Mb)  TX bytes:102324172 (97.5 Mb)
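
For reference, the 8109 in the interface name is consistent with the partition number 0x109 plus the full-membership bit (0x8000) of the 16-bit InfiniBand P_Key. The following is only a sketch of that arithmetic; the helper names full_member_pkey and ipoib_child_name are made up for illustration and are not part of Open MPI or any IB tool:

```python
# Top bit of the 16-bit P_Key marks full (as opposed to limited) membership.
FULL_MEMBER_BIT = 0x8000


def full_member_pkey(partition):
    """Return the 16-bit P_Key for a full member of `partition`."""
    if not 0 <= partition <= 0x7FFF:
        raise ValueError("partition number must fit in 15 bits")
    return partition | FULL_MEMBER_BIT


def ipoib_child_name(parent, partition):
    """Build the IPoIB child interface name, e.g. parent 'ib0' plus the P_Key in hex."""
    return f"{parent}.{full_member_pkey(partition):04x}"


# Partition 0x109 on ib0, as in the ifconfig output above.
print(hex(full_member_pkey(0x109)))
print(ipoib_child_name("ib0", 0x109))
```

This is why the value 0x8109, rather than 0x109, is the one that appears in the mpirun commands in this message.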
I have tried the following:
/opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl openib,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109 -mca btl_openib_ib_pkey_ix 1 /cluster/pallas/x86_64-ib/IMB-MPI1
but I just get a RETRY EXCEEDED ERROR. Is there an MCA parameter I am missing?
I was successful using tcp only:
/opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl tcp,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109 /cluster/pallas/x86_64-ib/IMB-MPI1
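
Since the openib command passes both btl_openib_ib_pkey_val and btl_openib_ib_pkey_ix, one thing that may be worth checking is which index 0x8109 actually occupies in the HCA's P_Key table on each node. Below is a rough sketch, assuming the usual Linux sysfs layout /sys/class/infiniband/<device>/ports/<port>/pkeys/<index>, where each file holds one key in hex. The function name find_pkey_index and the device name mthca0 in the usage comment are hypothetical:

```python
from pathlib import Path


def find_pkey_index(pkeys_dir, wanted):
    """Return the table index whose P_Key equals `wanted`, or None if absent.

    `pkeys_dir` is assumed to contain one file per table index (named
    "0", "1", ...), each holding a hex value such as "0xffff".
    """
    # Ignore anything that is not a numerically named table entry.
    entries = [p for p in Path(pkeys_dir).iterdir() if p.name.isdigit()]
    for entry in sorted(entries, key=lambda p: int(p.name)):
        value = int(entry.read_text().strip(), 16)
        if value == wanted:
            return int(entry.name)
    return None


# Hypothetical usage on a node (the device name is an assumption):
#   find_pkey_index("/sys/class/infiniband/mthca0/ports/1/pkeys", 0x8109)
```

If the index found this way differs from the value given to btl_openib_ib_pkey_ix, that mismatch would be one candidate explanation to rule out, though I have not confirmed it is the cause here.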
Thanks,
Matt Burgess