Open MPI User's Mailing List Archives

From: Andrew Friedley (afriedle_at_[hidden])
Date: 2007-10-23 12:46:06


Troy Telford wrote:
> On Monday 22 October 2007, Don Kerr wrote:
>> Couple of things.
>> With linux I believe you need the interface instance in the 7th field of
>> the /etc/dat.conf file.
>> example:
>>
>> InfiniHost0 u1.1 nonthreadsafe default /usr/lib64/libdapl.so ri.1.1 " " " "
>> should be
>> InfiniHost0 u1.1 nonthreadsafe default /usr/lib64/libdapl.so ri.1.1 "ib0 0 " " "
>
> Yeah, I noticed that when I compared it to an OFED install; in this particular
> IB stack, the "blank" field goes to the "default" value, so while it looks
> like an issue, it isn't. (On this stack, the blank is the equivalent of
> having "InfiniHost0 ib1" - and yes, the names aren't what you'd expect, which
> makes support even more difficult...)
>
>> Also, I did see a problem when running with a version older than OFED 1.2,
>> which I did not pursue because v1.2 worked. Last, it appears that you are
>> running uDAPL 1.1; I have only ever run on 1.2, so I don't know what to
>> expect.
>
> You're 100% correct that it is uDAPL 1.1. If you've only used 1.2 (and hence,
> coded/tested for uDAPL 1.2), then I wouldn't be surprised if that's the reason
> why it isn't working.

Actually, I'm not convinced this is the reason. The error you've
given occurs very early; in fact, it comes from the first interface-specific
DAT function call. A quick look at the v1.2 spec shows nothing relevant
changed between v1.1 and v1.2.

As the error message indicates, this usually means something isn't right
with your uDAPL installation, but the fact that Intel MPI works is kind
of strange -- are you sure it was using uDAPL? Unfortunately, at this
point I'm not sure what to suggest trying.
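
One thing that might at least rule out the dat.conf question Don raised
is a quick check for entries whose 7th field (the interface instance) is
blank. The following is only a sketch under assumptions: it treats each
entry as whitespace-separated fields with optional double quotes, the
function names are made up for illustration, and the second sample entry
(InfiniHost1) is hypothetical. As Troy notes, a blank field is not
necessarily an error on every stack, so treat a hit as a hint, not a
diagnosis.

```python
import shlex


def parse_dat_conf_line(line):
    """Split one dat.conf-style entry into fields, honoring double quotes.

    Returns None for blank lines and '#' comment lines.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    return shlex.split(stripped)


def ia_params(fields):
    """Return the 7th field with surrounding whitespace removed.

    An empty string means the field is blank or missing.
    """
    return fields[6].strip() if len(fields) > 6 else ""


def blank_instance_entries(text):
    """Return the names (1st field) of entries whose 7th field is blank."""
    names = []
    for line in text.splitlines():
        fields = parse_dat_conf_line(line)
        if fields and not ia_params(fields):
            names.append(fields[0])
    return names


# First entry is the one quoted in this thread; the second (InfiniHost1)
# is a hypothetical entry added only to show a non-blank 7th field.
SAMPLE = '''\
InfiniHost0 u1.1 nonthreadsafe default /usr/lib64/libdapl.so ri.1.1 " " " "
InfiniHost1 u1.1 nonthreadsafe default /usr/lib64/libdapl.so ri.1.1 "ib0 0" " "
'''

if __name__ == "__main__":
    for name in blank_instance_entries(SAMPLE):
        print(f"{name}: 7th field is blank")
```

To run it against a real system, read /etc/dat.conf into a string and
pass that to blank_instance_entries() in place of SAMPLE.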

Also, does Don's test code that you tried call dat_ia_open()? If not,
then it wouldn't reproduce the error you're getting.

Which uDAPL implementation are you using? I originally started
developing the uDAPL BTL using Myricom's implementation for GM, but
abandoned it because it was far too broken to use.

Andrew

>
> MVAPICH2's uDAPL support is 1.2 or greater, so I wouldn't be surprised if the
> story is similar for Open MPI.
>
> I'd add that it may be useful to others to mention what version(s) of uDAPL
> work with Open MPI in the documentation or FAQ.