Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Deadlock on large numbers of processors
From: Lenny Verkhovsky (lenny.verkhovsky_at_[hidden])
Date: 2008-12-09 08:05:19


also see https://svn.open-mpi.org/trac/ompi/ticket/1449

On 12/9/08, Lenny Verkhovsky <lenny.verkhovsky_at_[hidden]> wrote:
>
> maybe it's related to https://svn.open-mpi.org/trac/ompi/ticket/1378 ??
>
> On 12/5/08, Justin <luitjens_at_[hidden]> wrote:
>>
>> The reason I'd like to disable these eager buffers is to help detect the
>> deadlock better. I would not run with this for a normal run, but it would be
>> useful for debugging. If the deadlock is indeed due to our code, then
>> disabling any shared buffers or eager sends would make that deadlock
>> reproducible. In addition, we might be able to lower the number of
>> processors. Right now, determining which processor is deadlocked when we
>> are using 8K cores and each processor has hundreds of messages sent out
>> would be quite difficult.
>>
>> Thanks for your suggestions,
>> Justin
>> Brock Palen wrote:
>>
>>> Open MPI has different eager limits for each network type. On your
>>> system run:
>>>
>>> ompi_info --param btl all
>>>
>>> and look for the eager_limit parameters.
>>> You can set these values to 0 using the syntax I showed you before. That
>>> would disable eager messages.
>>> There might be a better way to disable eager messages.
>>> Not sure why you would want to disable them; they are there for
>>> performance.
>>>
>>> Maybe you would still see a deadlock if every message was below the
>>> threshold. I think there is a limit on the number of eager messages a
>>> receiving CPU will accept. Not sure about that, though, and I still
>>> kind of doubt it.
>>>
>>> Try tweaking your buffer sizes: make the openib btl eager limit the
>>> same as the shared memory one, and see if you get lockups between hosts
>>> and not just over shared memory.
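
A minimal sketch of the experiment suggested above, written so that it only prints the mpirun line for review rather than launching anything. The BTL names sm and openib come from this thread; the parameter name btl_openib_eager_limit is assumed by analogy with btl_sm_eager_limit and should be confirmed against the ompi_info output on your system. The helper eager_limit_args and the ./app executable are hypothetical, and whether a given BTL actually accepts a limit of 0 has not been verified here.

```shell
#!/bin/sh
# eager_limit_args VALUE BTL...
# Hypothetical helper: print one "--mca btl_<name>_eager_limit VALUE"
# pair per BTL name. Names are not checked against ompi_info.
eager_limit_args() {
    value=$1
    shift
    args=""
    for btl in "$@"; do
        args="$args --mca btl_${btl}_eager_limit $value"
    done
    # Drop the leading space left by the loop.
    printf '%s\n' "${args# }"
}

# Print the command instead of running it.
# ./app is a placeholder for the real executable.
echo "mpirun $(eager_limit_args 0 sm openib) ./app"
```

Passing the same value for every BTL is what makes the comparison between on-node and between-host behaviour meaningful; a different value (for example 4096) can be substituted for 0 in the call.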
>>>
>>> Brock Palen
>>> www.umich.edu/~brockp
>>> Center for Advanced Computing
>>> brockp_at_[hidden]
>>> (734)936-1985
>>>
>>>
>>>
>>> On Dec 5, 2008, at 2:10 PM, Justin wrote:
>>>
>>>> Thank you for this info. I should add that our code tends to post a lot
>>>> of sends prior to the other side posting receives. This causes a lot of
>>>> unexpected messages to exist. Our code explicitly matches up all tags and
>>>> processors (that is, we do not use MPI wildcards). If we had a deadlock, I
>>>> would think we would see it regardless of whether or not we cross the
>>>> rendezvous threshold. I guess one way to test this would be to set this
>>>> threshold to 0. If it then deadlocks, we would likely be able to track down
>>>> the deadlock. Are there any other parameters we can pass to MPI that will
>>>> turn off buffering?
>>>>
>>>> Thanks,
>>>> Justin
>>>>
>>>> Brock Palen wrote:
>>>>
>>>>> Whenever this has happened, we found the code to have a deadlock. Users
>>>>> never saw it until they crossed the eager->rendezvous threshold.
>>>>>
>>>>> Yes you can disable shared memory with:
>>>>>
>>>>> mpirun --mca btl ^sm
>>>>>
>>>>> Or you can try increasing the eager limit.
>>>>>
>>>>> ompi_info --param btl sm
>>>>>
>>>>> MCA btl: parameter "btl_sm_eager_limit" (current value: "4096")
>>>>>
>>>>> You can modify this limit at run time. I think (can't test it right
>>>>> now) it is just:
>>>>>
>>>>> mpirun --mca btl_sm_eager_limit 40960
>>>>>
>>>>> I think you can also use environment variables to tweak these values
>>>>> instead of putting it all on the mpirun line:
>>>>>
>>>>> export OMPI_MCA_btl_sm_eager_limit=40960
>>>>>
>>>>> See:
>>>>> http://www.open-mpi.org/faq/?category=tuning
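
The environment-variable form above can be wrapped in a small helper so that a debugging session sets several parameters in one consistent way. This is only a sketch: set_mca is a hypothetical function, not part of Open MPI, and it relies on the OMPI_MCA_ prefix shown in this thread. Setting btl to ^sm through the environment is assumed to behave like the --mca btl ^sm form given earlier.

```shell
#!/bin/sh
# set_mca NAME VALUE
# Hypothetical helper: export one MCA parameter as OMPI_MCA_<NAME>
# so that a later mpirun started from this shell inherits it.
set_mca() {
    export "OMPI_MCA_$1=$2"
}

# Values taken from the thread: a larger shared-memory eager limit,
# and the BTL selection that excludes shared memory.
set_mca btl_sm_eager_limit 40960
set_mca btl '^sm'

# Show what is currently set before launching anything.
env | grep '^OMPI_MCA_'
```

Settings made this way persist for the rest of the shell session, so unset them (or start a fresh shell) before returning to a normal production run.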
>>>>>
>>>>>
>>>>> Brock Palen
>>>>> www.umich.edu/~brockp
>>>>> Center for Advanced Computing
>>>>> brockp_at_[hidden]
>>>>> (734)936-1985
>>>>>
>>>>>
>>>>>
>>>>> On Dec 5, 2008, at 12:22 PM, Justin wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We are currently using Open MPI 1.3 on Ranger for large processor jobs
>>>>>> (8K+). Our code appears to be occasionally deadlocking at random within
>>>>>> point-to-point communication (see stack trace below). This code has been
>>>>>> tested on many different MPI versions and, as far as we know, it does not
>>>>>> contain a deadlock. However, in the past we have run into problems with
>>>>>> shared memory optimizations within MPI causing deadlocks. We can usually
>>>>>> avoid these by setting a few environment variables to either increase the
>>>>>> size of shared memory buffers or disable shared memory optimizations
>>>>>> altogether. Does Open MPI have any known deadlocks that might be causing
>>>>>> our deadlocks? If so, are there any workarounds? Also, how do we disable
>>>>>> shared memory within Open MPI?
>>>>>>
>>>>>> Here is an example of where processors are hanging:
>>>>>>
>>>>>> #0 0x00002b2df3522683 in mca_btl_sm_component_progress () from /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_btl_sm.so
>>>>>> #1 0x00002b2df2cb46bf in mca_bml_r2_progress () from /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_bml_r2.so
>>>>>> #2 0x00002b2df0032ea4 in opal_progress () from /opt/apps/intel10_1/openmpi/1.3/lib/libopen-pal.so.0
>>>>>> #3 0x00002b2ded0d7622 in ompi_request_default_wait_some () from /opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
>>>>>> #4 0x00002b2ded109e34 in PMPI_Waitsome () from /opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Justin
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> users_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users