FAQ:
Tuning the run-time characteristics of MPI UDAPL communications

Table of contents:

  1. What versions of Open MPI contain support for uDAPL?
  2. How do Sun Microsystems ClusterTools 7 and Open MPI differ with regard to the uDAPL BTL?
  3. What values are expected by the btl_udapl_if_include and btl_udapl_if_exclude MCA parameters?
  4. Where is the static uDAPL Registry found?
  5. How come the value reported by "ifconfig" is not accepted by the btl_udapl_if_include/btl_udapl_if_exclude MCA parameter?
  6. I get a warning message about not being able to register memory and possibly out of privileged memory while running on Solaris, what can I do?


1. What versions of Open MPI contain support for uDAPL?

The following versions of Open MPI contain support for uDAPL:

Open MPI series       uDAPL supported
v1.0 series           No
v1.1 series           No
v1.2 series           Yes
v1.3 / v1.4 series    Yes
v1.5 / v1.6 series    Yes
v1.7 and beyond       No


2. How do Sun Microsystems ClusterTools 7 and Open MPI differ with regard to the uDAPL BTL?

Sun's ClusterTools is based on Open MPI, with one significant difference: ClusterTools includes uDAPL RDMA capabilities in the uDAPL BTL, whereas the Open MPI v1.2 uDAPL BTL does not. These improvements exist today in the Open MPI trunk and will be included in future Open MPI releases.


3. What values are expected by the btl_udapl_if_include and btl_udapl_if_exclude MCA parameters?

The uDAPL BTL looks for a match in the uDAPL static registry, which is contained in the dat.conf file. Each line that is neither blank nor a comment is treated as an interface entry. The first field of an entry is the value that must be supplied to the MCA parameter in question.

Solaris Example:

shell% datadm -v
ibd0  u1.2  nonthreadsafe  default  udapl_tavor.so.1  SUNW.1.0  " "  "driver_name=tavor"
shell% mpirun --mca btl_udapl_if_include ibd0 ...

Linux Example:

shell% cat /etc/dat.conf
OpenIB-cma u1.2 nonthreadsafe default /usr/local/ofed/lib64/libdaplcma.so dapl.1.2 "ib0 0" ""
OpenIB-bond u1.2 nonthreadsafe default /usr/local/ofed/lib64/libdaplcma.so dapl.1.2 "bond0 0" ""
shell% mpirun --mca btl_udapl_if_exclude OpenIB-bond ...
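
The matching rule described above (first field of every line that is neither blank nor a comment) can be sketched as a small shell helper. This is only an illustration: the function name list_udapl_interfaces is hypothetical, and the sketch assumes that comment lines start with "#". The path defaults to the Linux location.

```shell
# Hypothetical helper (not part of Open MPI): print the first field of
# every dat.conf line that is neither blank nor a comment.
# Assumption: comment lines begin with "#".
list_udapl_interfaces() {
    # Default to the Linux registry path; pass another path as $1.
    conf="${1:-/etc/dat.conf}"
    # NF > 0 skips blank lines; the $1 test skips comment lines.
    awk 'NF > 0 && $1 !~ /^#/ { print $1 }' "$conf"
}
```

Run against the Linux example file above, this would print OpenIB-cma and OpenIB-bond, one per line; either name could then be passed to btl_udapl_if_include or btl_udapl_if_exclude.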


4. Where is the static uDAPL Registry found?

Solaris: /etc/dat/dat.conf

Linux: /etc/dat.conf
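
For scripts that run on both platforms, the two locations can be selected from the output of "uname -s". The following is a minimal sketch; the function name dat_conf_path is hypothetical, and it assumes that Solaris reports "SunOS" and that every other system uses the Linux path.

```shell
# Hypothetical helper: print the static registry path for this platform.
# An OS name may be passed as $1; otherwise "uname -s" is used.
dat_conf_path() {
    case "${1:-$(uname -s)}" in
        SunOS) echo /etc/dat/dat.conf ;;  # Solaris
        *)     echo /etc/dat.conf ;;      # assumed Linux location
    esac
}
```

shell% cat "$(dat_conf_path)"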


5. How come the value reported by "ifconfig" is not accepted by the btl_udapl_if_include/btl_udapl_if_exclude MCA parameter?

uDAPL queries a static registry, defined in the dat.conf file, to find the interfaces that are available for use. The uDAPL BTL therefore has to match the names found in the registry, and these may differ from what "ifconfig" reports.
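
In the Linux example under question 3, the network device name that "ifconfig" reports ("ib0", "bond0") appears inside a quoted field of each registry entry. Assuming a dat.conf laid out like that example, the registry name for a given device could be looked up roughly as follows. The function name udapl_name_for_netdev is hypothetical, and other providers may lay out this field differently.

```shell
# Hypothetical helper: print the registry name (first field) of each
# dat.conf entry whose quoted parameter field starts with the given
# network device name, as in the Linux example ("ib0 0", "bond0 0").
udapl_name_for_netdev() {
    dev="$1"
    conf="${2:-/etc/dat.conf}"
    awk -v dev="$dev" '
        # Skip blank lines and comment lines (assumed to start with "#").
        NF > 0 && $1 !~ /^#/ {
            # Match the device name right after an opening double quote
            # and followed by a space.
            if (index($0, "\"" dev " ") > 0) print $1
        }' "$conf"
}
```

shell% mpirun --mca btl_udapl_if_include "$(udapl_name_for_netdev ib0)" ...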


6. I get a warning message about not being able to register memory and possibly out of privileged memory while running on Solaris, what can I do?

The warning probably looks something like this:

WARNING: The uDAPL BTL is not able to register memory. Possibly out of
allowed privileged memory (i.e. memory that can be pinned). Increasing
the allowed privileged memory may alleviate this issue.

One remedy is to increase the amount of available privileged memory. On Solaris, your system administrator can do this by editing the /etc/project file on the nodes. For more information, see the Solaris "project" man page:

shell% man project

As an example, first determine the amount of privileged memory currently available (a typical value is 978MB):

shell% prctl -n project.max-device-locked-memory -i project default
NAME    PRIVILEGE       VALUE    FLAG   ACTION          RECIPIENT
project.max-device-locked-memory
        privileged       978MB      -   deny            -
        system          16.0EB    max   deny            -

To increase the amount of privileged memory, edit the /etc/project file.

Default /etc/project file:

system:0::::
user.root:1::::
noproject:2::::
default:3::::
group.staff:10::::

Changed to allow, for example, 4GB:

system:0::::
user.root:1::::
noproject:2::::
default:3::::project.max-device-locked-memory=(priv, 4294967296, deny) 
group.staff:10::::
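
The value 4294967296 in the example is 4GB expressed in bytes (4 x 1024 x 1024 x 1024). As a rough sketch, the arithmetic and the resulting "default" project line can be generated in the shell. The function names gib_to_bytes and project_line are hypothetical, the output copies the attribute format shown above, and the sketch assumes a shell with 64-bit integer arithmetic.

```shell
# Hypothetical helpers for computing the byte value used in /etc/project.
# Assumption: the shell evaluates $(( ... )) with 64-bit integers.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print a "default" project line in the same format as the example above,
# allowing the given number of GB of device-locked memory.
project_line() {
    printf 'default:3::::project.max-device-locked-memory=(priv, %s, deny)\n' \
        "$(gib_to_bytes "$1")"
}
```

shell% gib_to_bytes 4
4294967296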