Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] allocating sm memory with page alignment
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-08-30 10:42:48


On Aug 29, 2008, at 5:52 PM, Eugene Loh wrote:

> I'm looking at the sm BTL.

Excellent! I hope you had a good dash of parmesan with that spaghetti
code in there (the sm btl is among the hairiest sections in
OMPI...). :-)

> In mca_btl_sm_add_procs(), there's a loop over peer processes, with
> a call to ompi_fifo_init(). That is, one call to ompi_fifo_init()
> for each connection
[snip]
> on page boundaries.

I *believe* your analysis is correct. It's been a while since I've
looked in detail in that section of code, but what you say sounds
reasonable.

> As the number of local processes increases, therefore these per-
> connection allocations become very costly. For 8K pages, for
> example, and 100 on-node processes, we're talking 3*100*100*8K = 240
> Mbytes. For 512 on-node processes (yes, we have nodes this big),
> that's 6 Gbyte... most of which is unused. (E.g., allocating more
> than an 8K page when we only need 64 or 12 bytes.)
>
> Okay, long intro. Let me start with a short question: do we really
> need page alignment for these allocations? Would cacheline
> alignment be okay?

I believe the main rationale for page alignment was memory affinity,
since (at least on Linux; I don't know about Solaris) you can only
affinity-ize whole pages.

On your big 512 proc machines, I'm assuming that the page memory
affinity will matter...?

That being said, we're certainly open to making things better. E.g.,
if a few procs share a memory locality (can you detect that in
Solaris?), have them share a page or somesuch...? (totally open to
ideas here)

-- 
Jeff Squyres
Cisco Systems