Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Shared Memory - Eager VS Rendezvous
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-05-23 09:05:49

On May 23, 2012, at 6:05 AM, Simone Pellegrini wrote:

>> If process A sends a message to process B and the eager protocol is used then I assume that the message is written into a shared memory area and picked up by the receiver when the receive operation is posted.

Open MPI has a few different shared memory protocols.

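For short messages, we always do what you describe above: copy-in/copy-out (CICO). The sender copies the message into shared memory and the receiver copies it back out.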

For large messages, we either use a pipelined CICO (as you surmised below) or use direct memory mapping if you have the Linux knem kernel module installed. More below.
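As a rough illustration of the choice described above, here is a minimal Python sketch. It is not Open MPI code: the names `Protocol` and `choose_protocol` and the 4096-byte threshold are assumptions made up for this example, and the real cutoff between "short" and "large" is a tunable that may differ on your build.

```python
from enum import Enum


class Protocol(Enum):
    EAGER_CICO = "eager copy-in/copy-out"
    PIPELINED_CICO = "pipelined copy-in/copy-out"
    KNEM_DIRECT_COPY = "knem direct copy"


# Hypothetical threshold, chosen only for illustration.
ASSUMED_EAGER_LIMIT = 4096


def choose_protocol(message_size, knem_available, eager_limit=ASSUMED_EAGER_LIMIT):
    """Pick a shared-memory protocol the way the text above describes it."""
    if message_size <= eager_limit:
        # Short messages always go through copy-in/copy-out.
        return Protocol.EAGER_CICO
    if knem_available:
        # Large message and knem is usable: copy directly between processes.
        return Protocol.KNEM_DIRECT_COPY
    # Large message without knem: pipeline fragments through shared memory.
    return Protocol.PIPELINED_CICO
```

For example, `choose_protocol(1 << 20, knem_available=False)` selects the pipelined CICO path in this model.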

>> When the rendezvous protocol is used, however, the message still needs to end up in the shared memory area somehow. I don't think any RDMA-like transfer exists for shared memory communications.

Just to clarify: RDMA = Remote Direct Memory Access, and the "remote" usually refers to a different physical address space (e.g., a different server).

In Open MPI's case, knem can use a direct memory copy between two processes.

>> Therefore you need to buffer this message somehow, however I assume that you don't buffer the whole thing but use some type of pipelined protocol so that you reduce the size of the buffer you need to keep in the shared memory.

Correct. For large messages, when using CICO, we copy the first fragment and the necessary metadata to the shmem block. When the receiver ACKs the first fragment, we pipeline the rest of the large message through the shmem block using CICO. With the sender and receiver (more or less) simultaneously writing to and reading from the circular shmem block, we probably won't fill it up -- meaning that the sender hypothetically won't need to block.

I'm skipping a bunch of details, but that's the general idea.
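To make the pipelining idea concrete, here is a small Python model. It is only a sketch under stated assumptions, not how Open MPI implements it: two threads stand in for the two processes, a bounded `queue.Queue` stands in for the circular shmem block, and the name `pipelined_cico` and its `fragment_size` and `slots` parameters are hypothetical.

```python
import queue
import threading


def pipelined_cico(message, fragment_size=4, slots=2):
    """Copy `message` from a sender to a receiver through a small bounded buffer."""
    shmem = queue.Queue(maxsize=slots)  # stand-in for the circular shmem block
    first_ack = threading.Event()       # receiver's ACK of the first fragment
    received = bytearray()

    def sender():
        fragments = [message[i:i + fragment_size]
                     for i in range(0, len(message), fragment_size)]
        first = fragments[0] if fragments else b""
        # Copy in the first fragment together with the metadata (total length).
        shmem.put((len(message), first))
        # Wait for the receiver to ACK before pipelining the rest.
        first_ack.wait()
        for fragment in fragments[1:]:
            # put() blocks only while every slot is still occupied.
            shmem.put((None, fragment))

    def receiver():
        total, fragment = shmem.get()   # copy out the first fragment
        received.extend(fragment)
        first_ack.set()
        while len(received) < total:
            _, fragment = shmem.get()   # copy out the remaining fragments
            received.extend(fragment)

    threads = [threading.Thread(target=sender), threading.Thread(target=receiver)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bytes(received)
```

Because the receiver drains slots while the sender fills them, the buffer can stay much smaller than the message. In this model the sender stalls only when it gets `slots` fragments ahead of the receiver.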

>> Is it completely wrong? It would be nice if someone could point me somewhere I can find more details about this. In the OpenMPI tuning page there are several details regarding the protocol utilized for IB but very little for SM.

Good point. I'll see if we can get some more info up there.

> I think I found the answer to my question on Jeff Squyres' blog:
> However, now I have a new question: how do I know if my machine uses the copyin/copyout mechanism or the direct mapping?

You need the Linux knem module. See the OMPI README and do a text search for "knem".
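As a quick first check, you can look for the knem device node. The Python sketch below assumes the module exposes a character device at `/dev/knem`; the function name `knem_device_present` is made up for this example. Note that the device being present only shows the kernel module is loaded. It does not prove that your Open MPI build was configured with knem support, so the README is still the place to confirm that.

```python
import os
import stat


def knem_device_present(path="/dev/knem"):
    """Return True if `path` exists and is a character device."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission to stat it: treat as not present.
        return False
    return stat.S_ISCHR(mode)


if __name__ == "__main__":
    if knem_device_present():
        print("knem device found; direct copy may be available")
    else:
        print("no knem device; expect copy-in/copy-out")
```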

Jeff Squyres