Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] size of shared-memory backing file + maffinity
From: Eugene Loh (Eugene.Loh_at_[hidden])
Date: 2009-01-13 14:29:50

Lenny Verkhovsky wrote:

>Actually the size is supposed to be the same,
Yes, I would think that that is how it is supposed to work.

>It is just supposed to bind a process to its closest memory node, instead of
>leaving it to the OS.
>mpool_sm_module.c:82: opal_maffinity_base_bind(&mseg, 1, mpool_sm->mem_node);
But this is not the code I'm concerned about. Sorry I was not clearer.
I'm not concerned about how much memory is being allocated within the
shared area. I'm concerned about how big the shared area is... how big
the files are that are sitting in /tmp and are being mmapped into the
processes' address spaces.

If I look in sm_btl_first_time_init(), I find a loop from i=0 to
i<num_mem_nodes. Each iteration calls
mca_mpool_base_module_create(), which in turn calls
mca_mpool_sm_init(). This function appears to create a file of size
num_local_procs*per_peer_size and mmap it into each process.

E.g., let's say we have 4 on-node processes and per_peer_size has the
default 32 MB value. So, presumably my shared file in /tmp that I map
into each process is 128 MB. But, if there are multiple memory nodes,
then I will have such a file for each memory node... possibly 4 of them
for a grand total of 512 MB of shared space.
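The arithmetic above can be sketched in a few lines of C. This is only an illustration of the sizing as I read it, not Open MPI code: backing_file_size() and total_shared_size() are hypothetical helpers, and the sketch assumes one full-size file is created per memory node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an Open MPI function): size of one backing file,
 * taken to be num_local_procs * per_peer_size as described in the text. */
size_t backing_file_size(size_t num_local_procs, size_t per_peer_size)
{
    /* Guard against overflow of the multiplication. */
    assert(per_peer_size == 0 || num_local_procs <= SIZE_MAX / per_peer_size);
    return num_local_procs * per_peer_size;
}

/* Total shared space, ASSUMING every memory node gets its own full-size file
 * (one loop iteration per memory node). */
size_t total_shared_size(size_t num_mem_nodes, size_t num_local_procs,
                         size_t per_peer_size)
{
    size_t total = 0;
    for (size_t i = 0; i < num_mem_nodes; i++) {
        total += backing_file_size(num_local_procs, per_peer_size);
    }
    return total;
}
```

With 4 on-node processes, 32 MB per peer, and 4 memory nodes, backing_file_size() gives 128 MB per file and total_shared_size() gives 512 MB.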

Does that explain my concern any better?

>On Mon, Jan 12, 2009 at 10:02 PM, Eugene Loh <Eugene.Loh_at_[hidden]> wrote:
>>I'm trying to understand how much shared memory is allocated when maffinity
>>is on.
>>The sm BTL sets up a file that is mmapped into each local process's address
>>space so that the processes on a node can communicate via shared memory.
>>Actually, when maffinity indicates that there are multiple "memory nodes" on
>>the node, then a separate file is set up and mmapped for each "memory node".
>>There is an MCA parameter named "[mpool_sm_per_]peer_size", which by default
>>is 32 Mbytes. The idea is that if there are n processes on the node, then the
>>size of the file to be mmapped in is n*32M.
>>But, if there are multiple "memory nodes", will there be that much more
>>shared memory? That is, is the total amount of shared memory that's mmapped
>>into all the processes:
>> mem_nodes * num_local_procs * peer_size
>>Or, should the file for a memory node be created with size proportional to
>>the number of processes that correspond to that memory node?
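To make the two alternatives in the quoted question concrete, here is a small sketch comparing them. Both functions are hypothetical, and the procs_on_node[] array (how many local processes are bound to each memory node) is an assumed input. I have not checked how, or whether, the sm BTL tracks that count.

```c
#include <assert.h>
#include <stddef.h>

/* Alternative 1: every memory node gets a file sized for ALL local processes:
 * mem_nodes * num_local_procs * peer_size. */
size_t total_full_size_per_node(size_t num_mem_nodes, size_t num_local_procs,
                                size_t peer_size)
{
    return num_mem_nodes * num_local_procs * peer_size;
}

/* Alternative 2: each memory node's file is sized only for the processes
 * that correspond to that node. procs_on_node[] is a hypothetical input. */
size_t total_proportional(const size_t *procs_on_node, size_t num_mem_nodes,
                          size_t peer_size)
{
    assert(procs_on_node != NULL || num_mem_nodes == 0);
    size_t total = 0;
    for (size_t i = 0; i < num_mem_nodes; i++) {
        total += procs_on_node[i] * peer_size;
    }
    return total;
}
```

For 4 local processes spread one per memory node over 4 memory nodes with a 32 MB peer_size, the first alternative totals 512 MB and the second totals 128 MB.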