> I am pretty sure that LAM exploits the fact that the virtual processors
> are all
> sharing the same memory, so communication is via memory and/or the PCI bus
> of the system, while my OPENMPI configuration doesn't exploit this. Is this
> a reasonable diagnosis of the dramatic difference in performance? More
It would be more likely that Open MPI is using shared memory and polling
on it, whereas LAM is using sockets, or at least blocking on something.
Polling is a bad thing when you oversubscribe a processor. When a process
blocks on a socket (or any OS handle), it immediately yields the CPU and
is removed from the scheduler's run queue. When it polls waiting for a
send or receive to complete, it burns CPU cycles, and the scheduler will
wait for the next quantum of time before running another process.
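The difference is easy to see outside of MPI. This little sketch (my own
illustration, not anything from LAM or Open MPI) waits the same amount of
wall-clock time two ways and reports how much CPU time each one burned: a
blocking read on a pipe consumes essentially nothing, while a busy-poll
loop eats a full CPU for the whole wait:

```python
import os
import threading
import time

def blocking_wait(duration=0.2):
    """Block on a pipe read; the OS deschedules us, so almost no CPU is used."""
    r, w = os.pipe()

    def writer():
        time.sleep(duration)       # simulate a peer that replies later
        os.write(w, b"x")

    t = threading.Thread(target=writer)
    start = time.process_time()
    t.start()
    os.read(r, 1)                  # blocks; we are off the run queue here
    cpu = time.process_time() - start
    t.join()
    os.close(r)
    os.close(w)
    return cpu

def polling_wait(duration=0.2):
    """Busy-poll on the clock; burns CPU cycles for the entire wait."""
    start = time.process_time()
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        pass                       # spinning, exactly like a polling progress loop
    return time.process_time() - start

if __name__ == "__main__":
    print("blocking CPU time: %.4f s" % blocking_wait())
    print("polling  CPU time: %.4f s" % polling_wait())
```

With one process per core the spinning is harmless (the CPU had nothing
better to do anyway); it only turns into the half-quantum latency described
below when another runnable process is starved of that CPU.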
So, if you send a message between 2 processes sharing the same
processor and both are polling, the latency will be on the order of half
the scheduler quantum (10ms on Linux). Things are much faster when the
polling processes sit on different CPUs (1-2 us), but when you don't
have several processors, the blocking socket overhead (~20us) is way
better than waiting out a quantum.
> importantly, how do I reconfigure OPENMPI to match the LAM performance.
Try disabling the shared memory device in OpenMPI. Unfortunately, I have
no clue how to do it.
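For what it's worth, the usual knob for this kind of thing is Open MPI's
MCA parameter mechanism. A sketch of what it might look like (parameter
names assumed from the 1.x-era MCA conventions; verify what your build
actually supports with `ompi_info`):

```shell
# Restrict the byte transfer layers so messages go over TCP instead of
# shared memory ("self" must stay so a rank can talk to itself):
mpirun --mca btl self,tcp -np 4 ./my_app

# Or ask Open MPI to yield the CPU when idle instead of spinning,
# which is the real issue when ranks share a processor:
mpirun --mca mpi_yield_when_idle 1 -np 4 ./my_app
```

Either way, check `ompi_info --param all all` for the exact names and
defaults in your installed version before relying on these.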