
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] sm_coll segv
From: Lenny Verkhovsky (lenny.verkhovsky_at_[hidden])
Date: 2009-08-10 11:47:12


I also have another question:
$ ompi_info -aa | grep mpool | grep sm
  MCA coll: parameter "coll_sm_mpool" (current value: "sm", data source:
default value)
  MCA mpool: parameter "mpool_sm_allocator" (current value: "bucket", data
source: default value)

What do these names mean, and don't they have to be the same?
Lenny.

On Mon, Aug 10, 2009 at 5:11 PM, Lenny Verkhovsky <
lenny.verkhovsky_at_[hidden]> wrote:

> Don't these allocations of bshe->smbhe_keys require some kind of memory
> translation from one process's memory space to another (in the bootstrap_init
> function in /ompi/mca/coll/sm/coll_sm_module.c)?
> If local rank 0 allocates (gets attached to) memory, others can't read it
> without proper translation.
> Lenny
>
> On Mon, Aug 10, 2009 at 2:26 PM, Lenny Verkhovsky <
> lenny.verkhovsky_at_[hidden]> wrote:
>
>> We saw these segvs too, with and without setting the sm btl.
>>
>> On Fri, Aug 7, 2009 at 10:51 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>>>
>>>
>>> On Thu, Aug 6, 2009 at 3:18 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>>>
>>>> Ok, with Terry's help, I found a segv in the coll sm. If you run
>>>> without the sm btl, there's an obvious bad parameter that we're passing that
>>>> results in a segv.
>>>>
>>>> LANL -- can you confirm / deny that these are the segv's that you were
>>>> seeing?
>>>
>>>
>>> Yes we can deny that those are the segv's we were seeing - we definitely
>>> had the sm btl active. I'll rerun the test on Monday and add the stacktrace
>>> to your ticket.
>>>
>>> Ralph
>>>
>>>
>>>>
>>>> While fixing this, I noticed that the sm btl and sm coll are sharing an
>>>> mpool when both are running. This probably used to be a good idea way back
>>>> when (e.g., when we were using a lot more shmem than we needed and core
>>>> counts were lower), but it seems like a bad idea now (e.g., the btl/sm is
>>>> fairly specific about the size of the mpool that is created -- it's just big
>>>> enough for its data structures).
>>>>
>>>> I'm therefore going to change the mpool string names that btl/sm and
>>>> coll/sm are looking for so that they get unique sm mpool modules.
>>>>
>>>> --
>>>> Jeff Squyres
>>>> jsquyres_at_[hidden]
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>
>>>
>>>
>>>
>>
>>
>
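
The address-translation concern raised in the thread (a pointer stored by one process in a shared segment is not valid in a process that attached the segment at a different base address) can be illustrated with a small sketch. This is not the coll/sm code. The names seg_header_t, ptr_to_offset, offset_to_ptr and demo_offset_translation are hypothetical, and the sketch assumes a POSIX system. It maps the same temporary file twice in a single process to stand in for two processes attaching at different addresses, and stores a base-relative offset instead of a raw pointer.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define SEG_SIZE 4096

/* Hypothetical header at the start of the shared segment (not an Open MPI type). */
typedef struct {
    size_t keys_offset; /* offset of the keys array from the segment base */
    size_t num_keys;    /* number of entries in the keys array */
} seg_header_t;

/* Convert a pointer inside a mapping into an offset from that mapping's base. */
static size_t ptr_to_offset(const void *base, const void *ptr)
{
    return (size_t)((const char *)ptr - (const char *)base);
}

/* Convert a base-relative offset back into a pointer valid for this mapping. */
static void *offset_to_ptr(void *base, size_t offset)
{
    return (char *)base + offset;
}

/* Map one backing file at two addresses, write through the first mapping and
 * read through the second. Returns the value read, or -1 on failure. */
static int demo_offset_translation(void)
{
    FILE *fp = tmpfile();
    if (fp == NULL) {
        return -1;
    }
    int fd = fileno(fp);
    if (ftruncate(fd, SEG_SIZE) != 0) {
        fclose(fp);
        return -1;
    }

    void *writer = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void *reader = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (writer == MAP_FAILED || reader == MAP_FAILED) {
        if (writer != MAP_FAILED) munmap(writer, SEG_SIZE);
        if (reader != MAP_FAILED) munmap(reader, SEG_SIZE);
        fclose(fp);
        return -1;
    }
    /* Both mappings are live at once, so their base addresses differ. */
    assert(writer != reader);

    /* The "rank 0" side lays out a header and a keys array, recording an offset. */
    seg_header_t *whdr = writer;
    int *wkeys = (int *)((char *)writer + sizeof(seg_header_t));
    whdr->keys_offset = ptr_to_offset(writer, wkeys);
    whdr->num_keys = 1;
    wkeys[0] = 42;

    /* The "other process" side rebuilds the pointer from its own base. */
    seg_header_t *rhdr = reader;
    int *rkeys = offset_to_ptr(reader, rhdr->keys_offset);
    int value = rkeys[0];

    munmap(writer, SEG_SIZE);
    munmap(reader, SEG_SIZE);
    fclose(fp);
    return value;
}
```

Had the header stored wkeys itself rather than an offset, the reading side would be dereferencing an address that is only meaningful relative to the writer's mapping. Whether coll/sm actually needs such a translation at the point in question is exactly what the thread is asking. The sketch only shows the general technique.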