That's a relief to know, although I'm still a bit concerned. I'm looking at
the code for the OpenMPI 1.3 trunk, and in the ob1 component I can see the
call chain
mca_pml_ob1_recv_frag_callback_match -> append_frag_to_list ->
MCA_PML_OB1_RECV_FRAG_ALLOC -> OMPI_FREE_LIST_WAIT -> __ompi_free_list_wait,
so I'm guessing that unless the deadlock issue has been resolved for that
function, it will still fail non-deterministically. I'm quite eager to give
it a try, but my component doesn't compile as-is against the 1.3 source. Is it
trivial to convert it?
Or maybe you were suggesting that I go into the code of ob1 myself and
manually change every _wait to _get?
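
To make the _wait versus _get distinction concrete, here is a minimal
standalone sketch. It is not Open MPI code: toy_free_list_t,
toy_free_list_get, toy_free_list_return and toy_recv_callback are
hypothetical names invented for illustration, and the real ompi_free_list
macros differ in signature and behaviour. The sketch assumes a single thread
that both consumes fragments and hands them back, which is the situation
where a blocking wait can never be satisfied.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a fragment free list; NOT the Open MPI ompi_free_list_t. */
#define TOY_POOL_SIZE 2

typedef struct toy_frag {
    struct toy_frag *next;
    int payload;
} toy_frag_t;

typedef struct {
    toy_frag_t  storage[TOY_POOL_SIZE];
    toy_frag_t *head;               /* stack of currently free fragments */
} toy_free_list_t;

static void toy_free_list_init(toy_free_list_t *fl)
{
    assert(fl != NULL);
    fl->head = NULL;
    for (int i = 0; i < TOY_POOL_SIZE; i++) {
        fl->storage[i].next = fl->head;
        fl->head = &fl->storage[i];
    }
}

/* Non-blocking allocation: returns NULL at once when the list is empty. */
static toy_frag_t *toy_free_list_get(toy_free_list_t *fl)
{
    toy_frag_t *frag = fl->head;
    if (frag != NULL) {
        fl->head = frag->next;
        frag->next = NULL;
    }
    return frag;
}

/* Hand a fragment back to the list (the role the BTL plays in the thread). */
static void toy_free_list_return(toy_free_list_t *fl, toy_frag_t *frag)
{
    frag->next = fl->head;
    fl->head = frag;
}

/* Hypothetical receive callback. Returns 1 if the payload was stored in a
 * fragment, 0 if no fragment was available and the caller must retry later.
 * A blocking variant would spin here forever, because the same thread is the
 * only one that can return a fragment. */
static int toy_recv_callback(toy_free_list_t *fl, int payload, toy_frag_t **out)
{
    toy_frag_t *frag = toy_free_list_get(fl);
    *out = frag;
    if (frag == NULL) {
        return 0;
    }
    frag->payload = payload;
    return 1;
}

/* Exhaust the pool, observe the deferral, return one fragment, then retry. */
static int toy_demo(void)
{
    toy_free_list_t fl;
    toy_frag_t *a, *b, *c;

    toy_free_list_init(&fl);
    int ok_a  = toy_recv_callback(&fl, 1, &a);
    int ok_b  = toy_recv_callback(&fl, 2, &b);
    int ok_c  = toy_recv_callback(&fl, 3, &c);   /* pool is empty here */
    toy_free_list_return(&fl, a);
    int retry = toy_recv_callback(&fl, 3, &c);   /* succeeds after the return */

    return ok_a && ok_b && !ok_c && retry && c->payload == 3;
}
```

Calling toy_demo() from a main function exercises the sequence: two
allocations succeed, the third is deferred instead of blocking, and the retry
succeeds once a fragment has been returned. How the real PML queues and
retries a deferred fragment is not shown here.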
2009/3/23 George Bosilca <bosilca_at_[hidden]>
> It is a known problem. When the free list is empty, a call to
> ompi_free_list_wait blocks the process until at least one fragment
> becomes available. As a fragment can become available only when returned by
> the BTL, this can lead to deadlocks in some cases. The workaround is to ban
> the usage of the blocking _wait function and replace it with the
> non-blocking version, _get. The PML has all the required logic to deal with
> the cases where a fragment cannot be allocated. We changed most of the BTLs
> to use _get instead of _wait a few months ago.
> On Mar 23, 2009, at 11:58 , Timothy Hayes wrote:
>> I'm working on an OpenMPI BTL component and am having a recurring problem;
>> I was wondering if anyone could shed some light on it. I have a component
>> that's quite straightforward: it uses a pair of lightweight sockets to take
>> advantage of being in a virtualised environment (specifically Xen). My code
>> is a bit messy and has lots of inefficiencies, but the logic seems sound
>> enough. I've been able to execute a few simple programs successfully using
>> the component, and they work most of the time.
>> The problem I'm having is actually happening in higher layers,
>> specifically in my asynchronous receive handler, when I call the callback
>> function (cbfunc) that was set by the PML in the BTL initialisation phase.
>> It seems to be getting stuck in an infinite loop at __ompi_free_list_wait().
>> In this function there is a condition variable which should get set
>> eventually but just doesn't. I've stepped through it with GDB and I get a
>> backtrace of something like this:
>> mca_btl_xen_endpoint_recv_handler -> mca_btl_xen_endpoint_start_recv ->
>> mca_pml_ob1_recv_frag_callback -> mca_pml_ob1_recv_frag_match ->
>> __ompi_free_list_wait -> opal_condition_wait
>> and from there it just loops. Although this is happening in higher levels,
>> I haven't noticed something like this happening in any of the other BTL
>> components, so chances are there's something in my code that's causing this.
>> I very much doubt that it's actually waiting for a list item to be returned,
>> since this infinite loop can occur non-deterministically and sometimes even
>> on the first receive callback.
>> I'm really not too sure what else to include with this e-mail. I could
>> send my source code (a bit nasty right now) if it would be helpful, but I'm
>> hoping that someone might have noticed this problem before or something
>> similar. Maybe I'm making a common mistake. Any advice would be really
>> appreciated.
>> I'm using OpenMPI 1.2.9 from the SVN tag repository.
>> Kind regards
>> Tim Hayes
>> devel mailing list