OK, I've got a system set up so that it can use uDAPL over IB on Linux
(not OFED and not Mellanox, though).
Running simple dapl test programs (shamelessly pulled from the OFED tree)
seems to verify that DAPL is in fact operating properly.
After searching through the mail archives, I found a small test program by
Donald Kerr (dat_reg.c), and compiled and ran it successfully. When run, it
returns the DAT name (ib0).
I've also been able to run programs using uDAPL with Intel MPI, for example.
I'm fairly sure uDAPL is working.
However, when I attempt to run an MPI program over uDAPL (--mca btl
udapl,sm,self), I receive the following error:
WARNING: Failed to open "ib0"
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
[0,1,0]: uDAPL on host n02 was unable to find any NICs.
I've also tried using --mca btl_udapl_if_include ib0, but that doesn't seem to
have any effect.
Interestingly enough, when I don't specify a DAT provider and I play with the
name in /etc/dat.conf, Open MPI seems aware of the name change; it will
report 'Failed to open "newname"'.
My /etc/dat.conf looks like this:
InfiniHost0 u1.1 nonthreadsafe default /usr/lib64/libdapl.so ri.1.1 " " " "
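In case it helps anyone reading that entry, here is a rough Python sketch that
splits a dat.conf line into its fields. The field names are my assumption
about the usual dat.conf layout (interface adapter name, API version, thread
safety, default flag, library path, provider version, IA parameters, platform
parameters), and parse_dat_conf is just a hypothetical helper, not part of any
DAT or Open MPI tool.

```python
import shlex

# Assumed field order for a dat.conf entry; the names are illustrative only.
FIELDS = ("ia_name", "api_version", "threadsafety", "default",
          "lib_path", "provider_version", "ia_params", "platform_params")


def parse_dat_conf(text):
    """Return one dict per non-blank, non-comment line of dat.conf text."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # shlex.split keeps each double-quoted field as a single token.
        tokens = shlex.split(line)
        if len(tokens) != len(FIELDS):
            raise ValueError("expected %d fields, got %d: %r"
                             % (len(FIELDS), len(tokens), line))
        entries.append(dict(zip(FIELDS, tokens)))
    return entries


# The single entry from my /etc/dat.conf, copied verbatim.
SAMPLE = ('InfiniHost0 u1.1 nonthreadsafe default '
          '/usr/lib64/libdapl.so ri.1.1 " " " "\n')

entries = parse_dat_conf(SAMPLE)
for entry in entries:
    print(entry["ia_name"], entry["lib_path"])
```

To check the real file instead of the sample string, pass
open("/etc/dat.conf").read() to parse_dat_conf.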
Any ideas on why I'm not able to get Open MPI to use uDAPL?