I'm running OMPI on a large cluster of 8-core nodes. I
understand that the SM BTL is a "good thing", but I'm
curious about its use of memory-mapped files. I believe
these files are created in $TMPDIR, which defaults to /tmp.
In our cluster, the compute nodes are stateless, so /tmp
is actually in RAM. Keeping memory-mapped "files" in
memory seems kind of circular, although I know little
about these things. A bigger problem is that it appears
OMPI does not remove the files upon completion.
Another option is to redefine $TMPDIR to point to a
"real" file system. In our cluster, all the available
file systems are accessed over the IB fabric. So it
seems that there will be IB traffic, even though the
point of the SM BTL is to avoid this traffic.
Given the above two constraints, might it just be
better to disable the SM BTL entirely, and use the
IB BTL even within a node? Of course, the "self"
BTL should still be used if appropriate.
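
To make that last option concrete, here is a minimal sketch
of the launch line I have in mind, written as a small Python
helper only so the pieces are explicit. It assumes the usual
"--mca btl <list>" selection syntax with "openib" as the name
of the IB BTL; the helper name, the process count, and
./my_app are made up for illustration.

```python
import shlex


def build_mpirun_command(num_procs, program, use_shared_memory=False):
    """Return an mpirun argv list with an explicit BTL selection.

    Hypothetical helper for illustration only. Assumes the BTL names
    "sm" (shared memory), "openib" (InfiniBand) and "self" (loopback).
    """
    # Without "sm" in the list, on-node traffic would go through the IB BTL.
    btls = ["openib", "self"]
    if use_shared_memory:
        btls.insert(0, "sm")
    return [
        "mpirun",
        "--mca", "btl", ",".join(btls),
        "-np", str(num_procs),
        program,
    ]


# Example: 16 ranks of a placeholder binary, SM BTL left out.
cmd = build_mpirun_command(16, "./my_app")
print(shlex.join(cmd))
```

With use_shared_memory=True the same helper would put "sm"
back in the list, which is the configuration I am trying to
decide against.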
Any thoughts clarifying these issues would be
greatly appreciated. Thanks!
User Services Group      email: dpturner_at_[hidden]
NERSC Division           phone: (510) 486-4027
Lawrence Berkeley Lab    fax:   (510) 486-4316