On Wed, Feb 20, 2013 at 10:28:56AM -0800, Eugene Loh wrote:
> On 02/20/13 07:54, Jeff Squyres (jsquyres) wrote:
> >All MTT testing looks good for 1.6.4. There seems to be an MPI dynamics problem when --enable-sparse-groups is used, but this does not look like a regression to me.
> >I put out a final rc, because there was one more minor change to accommodate an MXM API change; it's in the usual place:
> > http://www.open-mpi.org/software/ompi/v1.6/
> >Unless something disastrous happens, I plan to release this as the final 1.6.4 tomorrow.
> I don't think this qualifies as "disastrous", but...
> I've been trying to do some 1.6 testing on Solaris. (Solaris 11,
> Oracle Studio compilers, both SPARC and x86) Results generally look
> good. The main issue appears to be:
> - SPARC
> - compile with "-m32 -xmemalign=8s" (the latter means assume at most 8-byte alignment, with sigbus for misalignment)
> - openib
> There is a sigbus during MPI_Init. Specifically, if I go to btl_openib_frag.h out_constructor(), I see:
> frag->sr_desc.wr_id = (uint64_t)(uintptr_t)frag;
> and the left-hand side is on a 4-byte (but not 8-byte) boundary. How hard would it be to get openib frags on 8-byte boundaries?
Very easy. Just adjust the parameters given to ompi_free_list_init(). There are arguments for frag alignment and data alignment. In btl_openib_component.c, a number of free lists have the alignment set to 2. Change those to 8 and see if that fixes the problem.
Anyone know why these were set with an alignment of 2 in the first place? I would have expected 8 or opal_cache_line_size.
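
For illustration, here is a minimal sketch of why an alignment of 2 can leave a frag on a 4-byte (but not 8-byte) boundary while an alignment of 8 cannot. It does not use the Open MPI free list API. align_up() and is_aligned() are hypothetical helpers, the addresses are made up, and I am assuming the free list rounds each element's address up to the requested power-of-two alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an Open MPI function): round addr up to the
 * next multiple of alignment. alignment must be a power of two. */
static uintptr_t align_up(uintptr_t addr, size_t alignment)
{
    uintptr_t mask = (uintptr_t)alignment - 1;
    return (addr + mask) & ~mask;
}

/* Hypothetical helper: nonzero if addr is a multiple of alignment
 * (again assuming a power-of-two alignment). */
static int is_aligned(uintptr_t addr, size_t alignment)
{
    return (addr & ((uintptr_t)alignment - 1)) == 0;
}
```

With a made-up raw address of 0x1004, align_up(0x1004, 2) leaves the address unchanged at 0x1004, which is 4-byte but not 8-byte aligned, so a uint64_t store there would trap under -xmemalign=8s. align_up(0x1004, 8) moves it to 0x1008, which is safe for the 64-bit wr_id store.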