
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Error - BTLs attempted: self sm - on a cluster with IB and openib btl enabled
From: Gus Correa (gus_at_[hidden])
Date: 2013-08-12 18:35:02


Hi Ralph

Sorry if this is more of an IB than an OMPI problem,
but from my vantage point it shows up as the OMPI jobs failing.

Yes, indeed I was setting memlock to unlimited in limits.conf
and for the pbs_mom daemon, restarting everything, and relaunching the job.
The error message changed, but the job still fails on Infiniband,
now complaining about the IB drivers and about being unable
to allocate memory.
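
For the archives, these are the two places where I set it
(a sketch of the relevant lines, not a verbatim copy of our files):

```
# /etc/security/limits.conf -- applies to login/ssh sessions on each node
*   soft   memlock   unlimited
*   hard   memlock   unlimited

# pbs_mom startup (e.g. in its init script), so Torque-spawned
# job processes inherit the limit as well:
ulimit -l unlimited
```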

Oddly, when I ssh to the node and run ibstat,
it responds normally (see below).
I actually ran ibstat everywhere, and all IB host adapters seem OK.
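
For anyone else debugging this: since pthread_create and the dynamic
loader both fail with "Cannot allocate memory", the per-process limits
inside the job environment are the first suspects. A quick sketch of
what to check (run it inside a job, since Torque/pbs_mom may impose
different limits than an interactive ssh session does):

```shell
#!/bin/bash
# Print the per-process limits most likely to break memory registration
# and thread creation: locked memory, max user processes (threads count
# against it on Linux), stack size, and virtual memory.
echo "memlock: $(ulimit -l)"
echo "nproc:   $(ulimit -u)"
echo "stack:   $(ulimit -s)"
echo "vmem:    $(ulimit -v)"
```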

Thank you,
Gus Correa

*********************** the job stderr ******************************
unable to alloc 512 bytes
Abort: Command not found.
unable to realloc 1600 bytes
Abort: Command not found.
libibverbs: Warning: couldn't load driver 'mlx4': libmlx4-rdmav2.so:
failed to map segment from shared object: Cannot allocate memory
libibverbs: Warning: couldn't load driver 'nes': libnes-rdmav2.so:
failed to map segment from shared object: Cannot allocate memory
libibverbs: Warning: couldn't load driver 'cxgb3': libcxgb3-rdmav2.so:
failed to map segment from shared object: Cannot allocate memory
libibverbs: Warning: couldn't load driver 'mthca': libmthca-rdmav2.so:
failed to map segment from shared object: Cannot allocate memory
libibverbs: Warning: couldn't load driver 'ipathverbs':
libipathverbs-rdmav2.so: failed to map segment from shared object:
Cannot allocate memory
libibverbs: Warning: no userspace device-specific driver found for
/sys/class/infiniband_verbs/uverbs0
libibverbs: Warning: couldn't load driver 'mlx4': libmlx4-rdmav2.so:
failed to map segment from shared object: Cannot allocate memory
libibverbs: Warning: couldn't load driver 'nes': libnes-rdmav2.so:
failed to map segment from shared object: Cannot allocate memory
libibverbs: Warning: couldn't load driver 'cxgb3': libcxgb3-rdmav2.so:
failed to map segment from shared object: Cannot allocate memory
libibverbs: Warning: couldn't load driver 'mthca': libmthca-rdmav2.so:
failed to map segment from shared object: Cannot allocate memory
libibverbs: Warning: couldn't load driver 'ipathverbs':
libipathverbs-rdmav2.so: failed to map segment from shared object:
Cannot allocate memory
[node15:29683] *** Process received signal ***
[node15:29683] Signal: Segmentation fault (11)
[node15:29683] Signal code: (128)
[node15:29683] Failing at address: (nil)
[node15:29683] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 29683 on node
node15.cluster exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
[node15.cluster:29682] [[7785,0],0]-[[7785,1],2] mca_oob_tcp_msg_recv:
readv failed: Connection reset by peer (104)
************************************************************

*************** ibstat on node15 *************************

[root_at_node15 ~]# ibstat
CA 'mlx4_0'
        CA type: MT26428
        Number of ports: 1
        Firmware version: 2.7.700
        Hardware version: b0
        Node GUID: 0x002590ffff16284c
        System image GUID: 0x002590ffff16284f
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 11
                LMC: 0
                SM lid: 1
                Capability mask: 0x02510868
                Port GUID: 0x002590ffff16284d
                Link layer: IB

************************************************************

On 08/12/2013 05:29 PM, Ralph Castain wrote:
> No, this has nothing to do with the registration limit.
> For some reason, the system is refusing to create a thread -
> i.e., it is pthread_create that is failing.
> I have no idea what would be causing that to happen.
>
> Try setting it to unlimited and see if it allows the thread
> to start, I guess.
>
>
> On Aug 12, 2013, at 2:20 PM, Gus Correa<gus_at_[hidden]> wrote:
>
>> Hi Ralph, all
>>
>> I include more information below,
>> after turning on btl_openib_verbose 30.
>> As you can see, OMPI tries, and fails, to load openib.
>>
>> Last week I reduced the memlock limit from unlimited
>> to ~12GB, as part of a general attempt to rein in memory
>> use/abuse by jobs sharing a node.
>> No parallel job ran until today, when the problem showed up.
>> Could the memlock limit be the root of the problem?
>>
>> The OMPI FAQ says the memlock limit
>> should be a "large number (or better yet, unlimited)":
>>
>> http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>>
>> The next two FAQ entries suggest that
>> it should be set to "unlimited", but don't say so explicitly:
>>
>> http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages-user
>> http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages-more
>>
>> QUESTION:
>> Is "unlimited" a must, or is there any (magic) "large number"
>> that would be OK for openib?
>>
>> I thought a 12GB memlock limit would be OK, but maybe it is not.
>> The nodes have 64GB RAM.
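
One detail worth stating here in case it matters: limits.conf takes
memlock values in kilobytes, so a 12 GB limit has to be written as
12582912, and a value accidentally entered in bytes or megabytes ends
up wildly different from what was intended. The arithmetic:

```shell
#!/bin/bash
# limits.conf memlock values are in KB: 12 GB = 12 * 1024 * 1024 KB
memlock_kb=$((12 * 1024 * 1024))
echo "memlock ${memlock_kb}"   # 12582912
```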
>>
>> Thank you,
>> Gus Correa
>>
>> *************************************************\
>> [node15.cluster][[8097,1],0][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node15.cluster][[8097,1],1][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node15.cluster][[8097,1],4][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node15.cluster][[8097,1],3][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node15.cluster][[8097,1],2][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> --------------------------------------------------------------------------
>> WARNING: There was an error initializing an OpenFabrics device.
>>
>> Local host: node15.cluster
>> Local device: mlx4_0
>> --------------------------------------------------------------------------
>> [node15.cluster][[8097,1],10][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node15.cluster][[8097,1],12][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node15.cluster][[8097,1],13][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node14.cluster][[8097,1],17][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node14.cluster][[8097,1],23][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node14.cluster][[8097,1],24][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node14.cluster][[8097,1],26][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node14.cluster][[8097,1],28][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> [node14.cluster][[8097,1],31][../../../../../ompi/mca/btl/openib/btl_openib_component.c:562:start_async_event_thread] Failed to create async event thread
>> --------------------------------------------------------------------------
>> At least one pair of MPI processes are unable to reach each other for
>> MPI communications. This means that no Open MPI device has indicated
>> that it can be used to communicate between these processes. This is
>> an error; Open MPI requires that all MPI processes be able to reach
>> each other. This error can sometimes be the result of forgetting to
>> specify the "self" BTL.
>>
>> Process 1 ([[8097,1],4]) is on host: node15.cluster
>> Process 2 ([[8097,1],16]) is on host: node14
>> BTLs attempted: self sm
>>
>> Your MPI job is now going to abort; sorry.
>> --------------------------------------------------------------------------
>>
>> *************************************************
>>
>> On 08/12/2013 03:32 PM, Gus Correa wrote:
>>> Thank you for the prompt help, Ralph!
>>>
>>> Yes, it is OMPI 1.4.3 built with openib support:
>>>
>>> $ ompi_info | grep openib
>>> MCA btl: openib (MCA v2.0, API v2.0, Component v1.4.3)
>>>
>>> There are three libraries in prefix/lib/openmpi,
>>> no mca_btl_openib library.
>>>
>>> $ ls $PREFIX/lib/openmpi/
>>> libompi_dbg_msgq.a libompi_dbg_msgq.la libompi_dbg_msgq.so
>>>
>>>
>>> However, this may just be because it is an older OMPI version
>>> in the 1.4 series: another cluster here with IB and OMPI 1.4.3
>>> has exactly these same libraries, and there is no problem there.
>>> The libraries' organization may have changed from
>>> the 1.4 to the 1.6 series, right?
>>> I only have mca_btl_openib libraries in the 1.6 series, and it
>>> would be a hardship to migrate this program to OMPI 1.6.
>>>
>>> (OK, I have newer OMPI installed too, but I still need the
>>> old version for some programs.)
>>>
>>> Why the heck is it not detecting the Infiniband hardware?
>>> [It used to detect it! :( ]
>>>
>>> Thank you,
>>> Gus Correa
>>>
>>>
>>> On 08/12/2013 03:01 PM, Ralph Castain wrote:
>>>> Check ompi_info - was it built with openib support?
>>>>
>>>> Then check that the mca_btl_openib library is present in the
>>>> prefix/lib/openmpi directory
>>>>
>>>> Sounds like it isn't finding the openib plugin
>>>>
>>>>
>>>> On Aug 12, 2013, at 11:57 AM, Gus Correa<gus_at_[hidden]> wrote:
>>>>
>>>>> Dear Open MPI pros
>>>>>
>>>>> On one of the clusters here, that has Infinband,
>>>>> I am getting this type of errors from
>>>>> OpenMPI 1.4.3 (OK, I know it is old ...):
>>>>>
>>>>> *********************************************************
>>>>> Tcl_InitNotifier: unable to start notifier thread
>>>>> Abort: Command not found.
>>>>> Tcl_InitNotifier: unable to start notifier thread
>>>>> Abort: Command not found.
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>> At least one pair of MPI processes are unable to reach each other for
>>>>> MPI communications. This means that no Open MPI device has indicated
>>>>> that it can be used to communicate between these processes. This is
>>>>> an error; Open MPI requires that all MPI processes be able to reach
>>>>> each other. This error can sometimes be the result of forgetting to
>>>>> specify the "self" BTL.
>>>>>
>>>>> Process 1 ([[907,1],68]) is on host: node11.cluster
>>>>> Process 2 ([[907,1],0]) is on host: node15
>>>>> BTLs attempted: self sm
>>>>>
>>>>> Your MPI job is now going to abort; sorry.
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>> *********************************************************
>>>>>
>>>>> Awkward, because I have "btl = ^tcp" in openmpi-mca-params.conf.
>>>>> The same error also happens if I force --mca btl openib,sm,self
>>>>> in mpiexec.
>>>>>
>>>>> ** Why is it attempting only the self and sm BTLs, but not openib? **
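
(For the archives, a sketch of the invocation I mean, with a
placeholder binary name: forcing the BTL list and raising the BTL
verbosity makes a failed openib load show up explicitly instead of
silently falling back to "self sm".)

```shell
#!/bin/bash
# Sketch: force the BTL list and raise BTL selection verbosity.
# "./connectivity_c" is a placeholder for the actual test binary.
MCA_ARGS="--mca btl openib,sm,self --mca btl_base_verbose 30"
echo "mpiexec $MCA_ARGS -np 2 ./connectivity_c"
```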
>>>>>
>>>>> I don't understand either the initial errors
>>>>> "Tcl_InitNotifier: unable to start notifier thread".
>>>>> Are they coming from Torque perhaps?
>>>>>
>>>>> As I said, the cluster has Infiniband,
>>>>> which is what we've been using forever, until
>>>>> these errors started today.
>>>>>
>>>>> When I divert the traffic to tcp
>>>>> (--mca btl tcp,sm,self), the jobs run normally.
>>>>>
>>>>> I am using the examples/connectivity_c.c program
>>>>> to troubleshoot this problem.
>>>>>
>>>>> ***
>>>>> I checked a few things on the IB side.
>>>>>
>>>>> The output of ibstat on all nodes seems OK (links up, etc),
>>>>> and so is the output of ibhosts and ibchecknet.
>>>>>
>>>>> Only two connected ports had errors, as reported by ibcheckerrors,
>>>>> and I cleared them with iblclearerrors.
>>>>>
>>>>> The IB subnet manager is running on the head node.
>>>>> I restarted the daemon, but nothing changed; the jobs continue to
>>>>> fail with the same errors.
>>>>>
>>>>> **
>>>>>
>>>>> Any hints of what is going on, how to diagnose it, and how to fix it?
>>>>> Any gentler way than rebooting everything and power-cycling
>>>>> the IB switch? (And would this brute-force method work, at least?)
>>>>>
>>>>> Thank you,
>>>>> Gus Correa
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>
>>
>