Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] sm_coll segv
From: Lenny Verkhovsky (lenny.verkhovsky_at_[hidden])
Date: 2009-08-10 10:11:35


Don't these allocations of bshe->smbhe_keys require some kind of memory
translation from one process's address space to another (in the
bootstrap_init function in /ompi/mca/coll/sm/coll_sm_module.c)?
If local rank 0 allocates (gets attached to) the memory, the others can't
read it without proper translation.
Lenny
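
To make the translation concern concrete, here is a minimal sketch in C. It is not the coll/sm code: every name in it (seg_header_t, seg_init, seg_read_key, map_twice) is hypothetical. It assumes a POSIX system and simulates two processes by mapping the same backing file twice, so the same bytes appear at two different virtual addresses. The segment stores a byte offset rather than a raw pointer, and each reader rebuilds the pointer from its own base address.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define SEG_SIZE 4096

/* Hypothetical header placed at the start of the shared segment. */
typedef struct {
    size_t keys_offset; /* byte offset of the keys array from the segment base */
    int    num_keys;
} seg_header_t;

/* Turn a segment-relative offset into a pointer valid in THIS mapping. */
static void *seg_ptr(void *base, size_t offset)
{
    return (char *) base + offset;
}

/* Writer side ("local rank 0"): lay out the segment using offsets only. */
static void seg_init(void *base, int num_keys)
{
    seg_header_t *hdr = base;

    assert(sizeof(seg_header_t) + (size_t) num_keys * sizeof(int) <= SEG_SIZE);
    hdr->keys_offset = sizeof(seg_header_t);
    hdr->num_keys = num_keys;

    int *keys = seg_ptr(base, hdr->keys_offset);
    for (int i = 0; i < num_keys; ++i) {
        keys[i] = 100 + i;
    }
}

/* Reader side (any rank): translate the stored offset against its own base. */
static int seg_read_key(void *base, int i)
{
    seg_header_t *hdr = base;
    int *keys = seg_ptr(base, hdr->keys_offset);
    return keys[i];
}

/* Map one temporary file twice to imitate two processes attaching to the
 * same segment at different addresses. Returns 0 on success, -1 on error. */
static int map_twice(void **a, void **b)
{
    FILE *f = tmpfile();
    if (f == NULL) {
        return -1;
    }
    int fd = fileno(f);
    if (ftruncate(fd, SEG_SIZE) != 0) {
        fclose(f);
        return -1;
    }
    *a = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    *b = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    fclose(f); /* the mappings stay valid after the descriptor is closed */
    return (*a == MAP_FAILED || *b == MAP_FAILED) ? -1 : 0;
}
```

Calling map_twice(&a, &b), then seg_init(a, 4), and reading with seg_read_key(b, 3) goes through the second mapping and still finds the value written through the first. Had the header stored the raw keys pointer computed from a, it would only be meaningful to a process that happened to map the segment at that same address.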

On Mon, Aug 10, 2009 at 2:26 PM, Lenny Verkhovsky <
lenny.verkhovsky_at_[hidden]> wrote:

> We saw these segvs too, with and without setting the sm btl.
>
> On Fri, Aug 7, 2009 at 10:51 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>
>>
>>
>> On Thu, Aug 6, 2009 at 3:18 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>>
>>> Ok, with Terry's help, I found a segv in the coll sm. If you run without
>>> the sm btl, there's an obvious bad parameter that we're passing that results
>>> in a segv.
>>>
>>> LANL -- can you confirm / deny that these are the segv's that you were
>>> seeing?
>>
>>
>> Yes, we can deny that those are the segv's we were seeing - we definitely
>> had the sm btl active. I'll rerun the test on Monday and add the stacktrace
>> to your ticket.
>>
>> Ralph
>>
>>
>>>
>>> While fixing this, I noticed that the sm btl and sm coll are sharing an
>>> mpool when both are running. This probably used to be a good idea way back
>>> when (e.g., when we were using a lot more shmem than we needed and core
>>> counts were lower), but it seems like a bad idea now (e.g., the btl/sm is
>>> fairly specific about the size of the mpool that is created -- it's just big
>>> enough for its data structures).
>>>
>>> I'm therefore going to change the mpool string names that btl/sm and
>>> coll/sm are looking for so that they get unique sm mpool modules.
>>>
>>> --
>>> Jeff Squyres
>>> jsquyres_at_[hidden]
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>
>