Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] sm_coll segv
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-08-07 03:51:45


On Thu, Aug 6, 2009 at 3:18 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:

> Ok, with Terry's help, I found a segv in the coll sm. If you run without
> the sm btl, there's an obvious bad parameter that we're passing that results
> in a segv.
>
> LANL -- can you confirm / deny that these are the segv's that you were
> seeing?

Yes, we can deny that those are the segv's we were seeing: we definitely had
the sm btl active. I'll rerun the test on Monday and add the stack trace to
your ticket.

Ralph

>
> While fixing this, I noticed that the sm btl and sm coll are sharing an
> mpool when both are running. This probably used to be a good idea way back
> when (e.g., when we were using a lot more shmem than we needed and core
> counts were lower), but it seems like a bad idea now (e.g., the btl/sm is
> fairly specific about the size of the mpool that is created -- it's just big
> enough for its data structures).
>
> I'm therefore going to change the mpool string names that btl/sm and
> coll/sm are looking for so that they get unique sm mpool modules.
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
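
The mpool-sharing concern Jeff describes can be illustrated with a toy model. This is only a sketch, not Open MPI code: the `Pool` and `Registry` classes, the pool sizes, and the name strings are all hypothetical, and the real mpool framework is written in C with a different interface. It assumes only what the message states, namely that components find their mpool by a string name and that the btl sizes the pool just large enough for its own data structures.

```python
# Toy model only: none of these names or sizes come from Open MPI.

class Pool:
    def __init__(self, name, size):
        self.name = name
        self.size = size   # bytes reserved when the pool was first created
        self.used = 0

    def alloc(self, nbytes):
        """Reserve nbytes; return False if the pool cannot satisfy the request."""
        if self.used + nbytes > self.size:
            return False
        self.used += nbytes
        return True


class Registry:
    """Hands out pools by string name; the same name yields the same pool."""

    def __init__(self):
        self._pools = {}

    def lookup(self, name, size):
        # The first caller fixes the size. A later caller using the same
        # name gets the existing pool, and its size request is ignored.
        if name not in self._pools:
            self._pools[name] = Pool(name, size)
        return self._pools[name]


def demo(btl_name, coll_name):
    """Return (shared?, btl allocation ok?, coll allocation ok?)."""
    reg = Registry()
    btl_pool = reg.lookup(btl_name, size=1024)   # sized only for the btl
    coll_pool = reg.lookup(coll_name, size=512)
    btl_ok = btl_pool.alloc(1024)
    coll_ok = coll_pool.alloc(512)
    return (btl_pool is coll_pool, btl_ok, coll_ok)


if __name__ == "__main__":
    print(demo("sm", "sm"))            # both components ask for the same name
    print(demo("sm_btl", "sm_coll"))   # each component asks for its own name
```

With a shared name, the second component lands in a pool that the first has already sized and filled for itself; with distinct names, each gets its own pool. That is one way sharing could go wrong under these assumptions, not a diagnosis of the segv discussed above.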