Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Deadlock on large numbers of processors
From: Lenny Verkhovsky (lenny.verkhovsky_at_[hidden])
Date: 2008-12-09 08:03:43


Maybe it's related to https://svn.open-mpi.org/trac/ompi/ticket/1378?

On 12/5/08, Justin <luitjens_at_[hidden]> wrote:
>
> The reason I'd like to disable these eager buffers is to help detect the
> deadlock better. I would not use this setting for a normal run, but it
> would be useful for debugging. If the deadlock is indeed due to our code,
> then disabling any shared buffers or eager sends should make that deadlock
> reproducible. In addition, we might be able to lower the number of
> processors. Right now, determining which processor is deadlocked when we
> are using 8K cores and each processor has hundreds of messages outstanding
> would be quite difficult.
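>
> For example (assuming the eager limits really do accept a value of 0, as
> suggested below), a debugging run might look something like this, with the
> application name and core count as placeholders:
>
>   mpirun -np 8192 \
>       --mca btl_sm_eager_limit 0 \
>       --mca btl_openib_eager_limit 0 \
>       ./our_app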
>
> Thanks for your suggestions,
> Justin
> Brock Palen wrote:
>
>> Open MPI has different eager limits for each of the network types; on your
>> system, run:
>>
>> ompi_info --param btl all
>>
>> and look for the eager_limits
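>>
>> (for example, to filter just those parameters, something like
>>
>>   ompi_info --param btl all | grep eager_limit
>>
>> should list them all.)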
>> You can set these values to 0 using the syntax I showed you before. That
>> would disable eager messages.
>> There might be a better way to disable eager messages.
>> Not sure why you would want to disable them, they are there for
>> performance.
>>
>> You might still see a deadlock even if every message were below the
>> threshold: I think there is a limit on the number of eager messages a
>> receiving CPU will accept. I'm not sure about that, though, and I somewhat
>> doubt it is the problem here.
>>
>> Try tweaking your buffer sizes: make the openib BTL eager limit the same
>> as the shared-memory one, and see if you get lock-ups between hosts and
>> not just over shared memory.
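>>
>> For instance, something along these lines (assuming btl_openib_eager_limit
>> is the right knob and the sm limit is still at its 4096 default):
>>
>>   mpirun --mca btl_openib_eager_limit 4096 ...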
>>
>> Brock Palen
>> www.umich.edu/~brockp
>> Center for Advanced Computing
>> brockp_at_[hidden]
>> (734)936-1985
>>
>>
>>
>> On Dec 5, 2008, at 2:10 PM, Justin wrote:
>>
>>> Thank you for this info. I should add that our code tends to post a lot
>>> of sends prior to the other side posting receives, which causes a lot of
>>> unexpected messages to exist. Our code explicitly matches up all tags and
>>> processors (that is, we do not use MPI wildcards). If we had a deadlock, I
>>> would think we would see it regardless of whether or not we cross the
>>> rendezvous threshold. I guess one way to test this would be to set this
>>> threshold to 0; if it then deadlocks, we would likely be able to track
>>> down the deadlock. Are there any other parameters we can pass to MPI that
>>> will turn off buffering?
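>>>
>>> (If a value of 0 is accepted, I assume we could set it either on the
>>> mpirun line or via the environment-variable form you mention, e.g.
>>>
>>>   export OMPI_MCA_btl_sm_eager_limit=0
>>>
>>> and presumably the same for the other BTLs' eager limits.)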
>>>
>>> Thanks,
>>> Justin
>>>
>>> Brock Palen wrote:
>>>
>>>> Whenever this has happened, we have found the code to have a deadlock;
>>>> users just never saw it until they crossed the eager->rendezvous
>>>> threshold.
>>>>
>>>> Yes you can disable shared memory with:
>>>>
>>>> mpirun --mca btl ^sm
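>>>>
>>>> (Or, equivalently, list the transports you do want explicitly; on an
>>>> InfiniBand cluster that would be something like
>>>>
>>>>   mpirun --mca btl self,openib ...
>>>>
>>>> keeping the self BTL in the list.)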
>>>>
>>>> Or you can try increasing the eager limit.
>>>>
>>>> ompi_info --param btl sm
>>>>
>>>> MCA btl: parameter "btl_sm_eager_limit" (current value:
>>>> "4096")
>>>>
>>>> You can modify this limit at run time. I think (I can't test it right
>>>> now) it is just:
>>>>
>>>> mpirun --mca btl_sm_eager_limit 40960
>>>>
>>>> When tweaking these values, I think you can also use environment
>>>> variables in place of putting it all on the mpirun line:
>>>>
>>>> export OMPI_MCA_btl_sm_eager_limit=40960
>>>>
>>>> See:
>>>> http://www.open-mpi.org/faq/?category=tuning
>>>>
>>>>
>>>> Brock Palen
>>>> www.umich.edu/~brockp
>>>> Center for Advanced Computing
>>>> brockp_at_[hidden]
>>>> (734)936-1985
>>>>
>>>>
>>>>
>>>> On Dec 5, 2008, at 12:22 PM, Justin wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We are currently using Open MPI 1.3 on Ranger for large processor jobs
>>>>> (8K+). Our code appears to be occasionally deadlocking at random within
>>>>> point-to-point communication (see the stack trace below). This code has
>>>>> been tested against many different MPI versions and, as far as we know,
>>>>> it does not contain a deadlock. However, in the past we have run into
>>>>> problems with shared-memory optimizations within MPI causing deadlocks.
>>>>> We can usually avoid these by setting a few environment variables to
>>>>> either increase the size of the shared-memory buffers or disable the
>>>>> shared-memory optimizations altogether. Does Open MPI have any known
>>>>> deadlocks that might be causing ours? If so, are there any workarounds?
>>>>> Also, how do we disable shared memory within Open MPI?
>>>>>
>>>>> Here is an example of where processors are hanging:
>>>>>
>>>>> #0 0x00002b2df3522683 in mca_btl_sm_component_progress () from
>>>>> /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_btl_sm.so
>>>>> #1 0x00002b2df2cb46bf in mca_bml_r2_progress () from
>>>>> /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_bml_r2.so
>>>>> #2 0x00002b2df0032ea4 in opal_progress () from
>>>>> /opt/apps/intel10_1/openmpi/1.3/lib/libopen-pal.so.0
>>>>> #3 0x00002b2ded0d7622 in ompi_request_default_wait_some () from
>>>>> /opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
>>>>> #4 0x00002b2ded109e34 in PMPI_Waitsome () from
>>>>> /opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Justin
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>>>
>>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>