On Mar 7, 2010, at 2:55 AM, Gijsbert Wiesenekker wrote:
> I was having non-reproducible hangs in an OpenMPI program. While troubleshooting this problem I found that there were many temporary directories in my /tmp/openmpi-sessions-userid directory (probably the result of MPI_Abort aborted OpenMPI programs). I cleaned those directories up and it looks like the hangs have gone.
> My questions are:
> It looks like the name of the temporary directory in /tmp/openmpi-sessions-userid directory is a process-id. What happens when an OpenMPI program starts and the temporary directory in /tmp/openmpi-sessions-userid already exists?
It'll just overwrite what is already there. I confess that the code has not been tested for that situation, though, so I can't guarantee that response.
> Could existing temporary directories in /tmp/openmpi-sessions-userid cause an OpenMPI program to hang?
Given your observations, I guess the answer has to be "yes", though I wouldn't have expected it. The typical behavior in this scenario is for the application to error out during MPI_Init when it finds that there isn't enough space in /tmp for the session directory - and that is caused not by the directory itself, but rather by the shared memory backing file that resides in the session dir tree and can be quite large.
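As a rough pre-flight check for the out-of-space failure described above, one could test how much room is left in /tmp before launching a job. The following is only a sketch in Python: the function name and the 512 MiB threshold are illustrative assumptions, not values taken from Open MPI, and the real size of the shared memory backing file depends on the job.

```python
import shutil


def has_room_for_session_dir(tmp_path="/tmp", min_free_bytes=512 * 1024 * 1024):
    """Return True if the filesystem holding tmp_path has enough free space.

    min_free_bytes is an assumed threshold for illustration; choose a value
    that matches the shared memory backing file your jobs actually create.
    """
    usage = shutil.disk_usage(tmp_path)  # named tuple: total, used, free
    return usage.free >= min_free_bytes


if __name__ == "__main__":
    if not has_room_for_session_dir():
        print("Warning: /tmp is low on space; MPI_Init may fail.")
```

A check like this only catches the out-of-space case; it says nothing about leftover session directories.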
> Is there a way to ensure that the temporary directory created in /tmp/openmpi-sessions-userid is always removed after an OpenMPI program has run?
It really, really helps to know what version you are using. We just found a bug in the 1.3/1.4 series that was causing the apps not to clean up in exactly this scenario - that is fixed in the upcoming 1.4.2 release. The older 1.2 series had this problem as well, but we aren't going back to fix it. :-)
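For versions without the fix, leftover directories can be cleared by hand, as the original poster did. A minimal sketch of automating that in Python is below. It rests on assumptions that are not confirmed in this thread: that every subdirectory of the session root belongs to one run, and that an old modification time means the run is finished. The function name, the one-day age limit, and the dry-run default are all hypothetical choices for illustration.

```python
import os
import shutil
import time


def remove_stale_session_dirs(session_root, max_age_seconds=24 * 3600, dry_run=True):
    """Remove subdirectories of session_root older than max_age_seconds.

    Age is judged by the directory's modification time, which is only a
    heuristic (assumption): a long-running job may own an old directory.
    Returns the sorted list of paths removed, or that would be removed
    when dry_run is True.
    """
    removed = []
    if not os.path.isdir(session_root):
        return removed
    cutoff = time.time() - max_age_seconds
    # Collect the entries first so nothing is deleted while still scanning.
    with os.scandir(session_root) as it:
        entries = list(it)
    for entry in entries:
        if not entry.is_dir(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime >= cutoff:
            continue
        if not dry_run:
            shutil.rmtree(entry.path, ignore_errors=True)
        removed.append(entry.path)
    return sorted(removed)
```

Running it first with the default dry_run=True and the path of your own /tmp/openmpi-sessions-userid directory lists what would be deleted; it is safest to run the real deletion only when none of your Open MPI jobs are active on that node.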