OK, with Jeff's kind help, I solved this issue in a very simple way.
Now I would like to report back the reason for this issue and the solution:
(1) The scenario under which this issue happened:
In my Open MPI environment, the $TMPDIR environment variable is
set to a different scratch directory for each MPI process, even when
some MPI processes are running on the same host. This is not a
problem if we use the openib, self, and tcp BTLs for communication.
However, if we use the sm (shared memory) BTL, then, as Jeff said:
Open MPI creates its shared memory files in $TMPDIR. It implicitly
expects all shared memory files to be found under the same
$TMPDIR for all procs on a single machine.
More specifically, Open MPI creates what we call a "session
directory" under $TMPDIR that is an implicit rendezvous point for all
processes on the same machine. Some metadata is put in there,
including the shared memory mmap files.
So if the different processes have different ideas of where the
rendezvous session directory exists, they'll end up blocking, waiting
for the others to show up at their (individual) rendezvous points... but
that will never happen, because each process is waiting at its
own rendezvous point.
So in this case, the MPI processes block waiting for each other at
the shared-memory rendezvous, a wait that is never satisfied, hence
the hang in the MPI_Init call.
(2) Solution to this issue:
You may set $TMPDIR to the same directory for all processes on the
same host, if possible; or you can set the OMPI_PREFIX_ENV
environment variable to a common directory for the MPI processes
on each host while keeping your per-process $TMPDIR setting.
Either way is verified and working fine for me!
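In shell terms, the two workarounds look roughly like the sketch below (the path /tmp, the process count, and my_mpi_app are only illustrative placeholders; the launch lines are commented out so you can adapt them to your own cluster):

```shell
# Workaround 1: give every process on a host the same TMPDIR.
export TMPDIR=/tmp

# Workaround 2: keep your per-process $TMPDIR, and instead point
# Open MPI's session directory at a common location.
export OMPI_PREFIX_ENV=/tmp

# Then launch as usual; "-x VAR" asks mpirun to export the variable
# to all launched processes:
#   mpirun -x TMPDIR -np 4 ./my_mpi_app
#   mpirun -x OMPI_PREFIX_ENV -np 4 ./my_mpi_app
```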