Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] 1.6.4rc5: final rc
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2013-02-20 16:17:23


If someone wants to submit a patch in the immediate future (i.e., within the next hour), great.

Otherwise, I'm still going to release 1.6.4 as-is.

If someone wants to submit a patch after 1.6.4 is out, that's fine -- if we ever do 1.6.5, it can go in there.

On Feb 20, 2013, at 4:09 PM, Nathan Hjelm <hjelmn_at_[hidden]> wrote:

> On Wed, Feb 20, 2013 at 10:28:56AM -0800, Eugene Loh wrote:
>> On 02/20/13 07:54, Jeff Squyres (jsquyres) wrote:
>>> All MTT testing looks good for 1.6.4. There seems to be an MPI dynamics problem when --enable-sparse-groups is used, but this does not look like a regression to me.
>>>
>>> I put out a final rc, because there was one more minor change to accommodate an MXM API change; it's in the usual place:
>>>
>>> http://www.open-mpi.org/software/ompi/v1.6/
>>>
>>> Unless something disastrous happens, I plan to release this as the final 1.6.4 tomorrow.
>>
>> I don't think this qualifies as "disastrous", but...
>>
>> I've been trying to do some 1.6 testing on Solaris (Solaris 11,
>> Oracle Studio compilers, both SPARC and x86). Results generally look
>> good. The main issue appears when all of the following hold:
>>
>> - SPARC
>> *AND*
>> - compile with "-m32 -xmemalign=8s" (the latter means assume at most 8-byte alignment, and raise SIGBUS on a misaligned access)
>> *AND*
>> - openib
>>
>> There is a SIGBUS during MPI_Init. Specifically, if I go to btl_openib_frag.h out_constructor(), I see:
>>
>> frag->sr_desc.wr_id = (uint64_t)(uintptr_t)frag;
>>
>> and the left-hand side is on a 4-byte (but not 8-byte) boundary. How hard would it be to get openib frags on 8-byte boundaries?
>
> Very easy. Just adjust the parameters given to ompi_free_list_init(). There are arguments for frag alignment and data alignment. In btl_openib_component.c, a number of free lists have the alignment set to 2. Change those to 8 and see if that fixes the problem.
>
> Anyone know why these were set with an alignment of 2 in the first place? I would have expected 8 or opal_cache_line_size.
>
> -Nathan
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/